Preface


I think I have been more confused about the nature of entropy than almost anything 
else I’ve encountered in physics. I remember I was initially mystified by the analysis of 
static forces, and again by the concept of the Green’s function: but entropy still causes 
me to ask myself: do I really understand this? And I don’t think I’m alone. 

For me, the solution to this unease was to teach statistical physics and to fix firmly in 
my mind what message I was to deliver. There were several possibilities. Was I to adhere 
to the information theoretic point of view that so appealed to me as an undergraduate, 
or was I to focus instead on the central role of dynamics, whether deterministic or 
stochastic? Which of the entropies of Boltzmann or Gibbs should I present as more 
fundamental? But these are fairly refined matters, and the message had to address deeper 
issues. Students would inevitably ask ‘what is entropy?’, and I realised that I needed to 
have a simple answer, and that the word ‘disorder’ was not going to do. 

This book takes a look at statistical thermodynamics with the question ‘what is 
entropy?’ very much to the fore. I want to show that up to a point, entropy is actually 
rather ordinary. It is a property of matter, if a little less familiar than energy, pressure and 
density, but connected to them all through the relationships of classical thermodynamics. 
We can measure it with relatively simple equipment such as a thermometer and a source 
of heat. 

Having established this, the change in the entropy of participants in thermodynamic 
processes can be discussed, and then we encounter the not-so-ordinary concept of the 
generation of entropy ‘out of nothing’. So we then develop statistical mechanics to try 
to find a microscopic view of what this quantity might represent, and to explain the 
classical laws of thermodynamics. Along the way, we build a powerful understanding of 
the properties of condensed matter, the traditional realm of application of these statistical 
ideas. But still, what is entropy? 

The answer is uncertainty: entropy is uncertainty commodified. At least this is the 
interpretation that makes best sense to me. We do not or cannot measure all the details of 
the present state of the world and so when processes occur, we are not quite sure what will 
happen, even if we believe that we understand the physics at the microscopic level. Our 
certainty about the future is less than our certainty about the present, unless we are dealing 
with very special systems. This is a matter of intuition and needs to be accommodated in 
any incomplete model of evolution at the macroscopic scale. The increased uncertainty 
is quantified as an increase in the total entropy of the world, and that is what entropy 
is. The most remarkable thing is that we can measure uncertainty with a thermometer. 

But maybe it is not as straightforward as that? Entropy and the second law of 
thermodynamics have been subjects of lengthy discussion over the years. The fact 
that a supposedly basic law of Nature has received repeated attention and fomented 
disagreements for decades, while other laws have been happily absorbed without dissent, 
can indicate several things. The most positive conclusion is that the issue is multifaceted, 
making it really important and interesting, and well worth the effort of trying to
understand it. A less encouraging conclusion is that perhaps people are discussing quite
different matters, and this has led to confusion. The word entropy has been applied 
to many technical and nontechnical concepts, and we have to be careful what we are 
saying. The property has been discussed in quite abstract and philosophical terms, as 
well as in terms of the hard thermodynamics of the laboratory. The often-quoted advice 
of von Neumann to Shannon, to name his proposed information measure ‘entropy’ on 
the grounds that nobody quite knew what entropy was, illustrates the situation perfectly. 
A great deal has been written about the matter, including some that I have not found
helpful, and this has done nothing to dispel my feeling of unease. 

Anyway, it is my sincere hope that the interpretation presented here will not be viewed 
as unhelpful. I want to provide a treatment that appeals to intuition without leaving too 
many loose ends in the mathematics, employing detailed examples to reinforce the
somewhat dry concepts. The book comprises a treatment of classical thermodynamics, with
the focus particularly on the role played by entropy, and the development of equilibrium 
statistical thermodynamics, all suitable for a second year undergraduate course. Later 
on, I provide a discussion of nonequilibrium statistical physics, in a manner intended to 
secure the idea of entropy as a measure of uncertainty. The dynamics of probability, and 
its application to Brownian motion, are included as lines of development. Towards the 
very end, I discuss fluctuation relations, which seem to me to provide insight into the 
behaviour of thermodynamic systems away from equilibrium, and into the very process 
of entropy generation, since they establish a link with dynamics. 

Nevertheless, the book is quite definitely intended for
undergraduates. I assume familiarity with elementary ideas of thermal behaviour
from an introductory course on the properties of matter, as well 
as exposure to suitable mathematics and the principles of quantum 
mechanics. Some material will be challenging at this level: hence 
an entropy hazard warning sign will appear in a few places! It is a 
short book, and obviously has associated deficiencies in the level of 
detail, particularly in the coverage of experimental support for some of the models. It 
is the focus on the nature of entropy that I hope will set it apart from the many other 
introductory books available on the subject of statistical physics, some of which I refer to 
in Further Reading. Otherwise, the reader might question the need for yet another
treatment! On the other hand, I wrote this book to alleviate the personal unease I felt towards
the concept of entropy, and to reach a position that I felt could be taught and defended; 
whether anyone else can find value in the undertaking is, of course, a huge bonus. 

I would like to express my gratitude to colleagues and students at UCL and elsewhere 
who have stimulated my thoughts on these topics or have offered encouragement and 
advice, in particular Richard Spinney, Brian Cowan, Rainer Klages, Rosemary Harris, 
Paul Tangney and Veronika Brazdova. I thank Roy Axell for introducing me to entropy 
all those years ago and I am grateful to the people at Wiley for this opportunity. 

Ian Ford
UCL, December 2012 

Instructors can access PowerPoint files of the illustrations presented within this text, for 
teaching, at http://booksupport.wiley.com. 



1 

Disorder or Uncertainty? 


This book is not a novel, and I think it is acceptable to give away the plot at the very 
outset. Entropy is a thermal property of matter, and when real (as opposed to idealised) 
macroscopic changes take place in the world, the total amount of entropy goes up. This 
is the celebrated second law of thermodynamics, so celebrated, in fact, that saying ‘the 
second law’ alone is often enough to convey which field it relates to. It is due to the 
efforts of Ludwig Boltzmann (1844-1906) and Josiah Willard Gibbs (1839-1903) that 
we now connect thermodynamic entropy with statistical ideas; with the uncertainty that 
prevails in the microscopic state of the world if we have only limited information about 
it. The growth of entropy when constraints on a system are removed, to initiate change, 
is a consequence of an increase in this uncertainty: the number of possibilities for the 
microscopic state goes up, and so does the entropy. 

It is often said that the rise in entropy is related to the natural tendency for disorder to 
increase, and while this can sometimes help to develop intuition, it can be misleading. 
The atoms of a crystalline solid held within a thermally insulated box have evidently 
chosen to arrange themselves as a regular lattice. They might instead have arranged 
themselves as a liquid with the same total energy, but at a lower temperature since some 
of the kinetic energy would need to be converted into potential energy in order to melt 
the solid. But they did not. Nature sometimes has a preference for spatially ordered 
instead of disordered systems: if we set up the system in the molten state, the material 
would spontaneously freeze. 

A better interpretation is that the spatially ordered arrangement of atoms in the solid 
has a larger number of underlying microstates than the cooler, but spatially disordered 
fluid. The disorder in atomic velocities is larger at the higher temperature (and even here 
I would rather say the uncertainty in velocities is larger) and this gives a greater overall 
uncertainty surrounding the actual microstate of the system, when in equilibrium, if the 
atoms are arranged as a solid. The selection rule imposed by Nature for the choice of 
macrostate is to maximise the uncertainty. 

An uncertain situation might convey the idea of disorder or untidiness, but we need to 
take care when we build analogies between entropy and untidy situations. My desk is very 
disordered, but this does not mean that it has more entropy than it would have if I were 
to tidy it. A disordered desk and a tidy desk are just two particular arrangements of the
system. But if I defined the term ‘untidy’ to encompass a certain set of arrangements of 
items on my desk, while another, much smaller, set of arrangements is classed as ‘tidy’, 
then I could start to make statistical statements about the likely condition (tidy/untidy) of 
my desk in the future, as long as I had a model of how the arrangement of items changed 
from day to day, as a result of my usual activities. I could define ‘tidy’ such that the 
fraction of desk area showing through the jumble is greater than 75%, say. Then a tidy 
desktop (few configurations, lots of desk showing) would most likely develop into an 
untidy desktop (many configurations, less desk showing) as the days (or even minutes!) 
passed. An untidy desk would probably remain untidy, though its evolution into a tidy 
desk is not beyond all expectation. 

But this is as far as ideas concerning the loss of order and gain of untidiness should be 
taken. A key point is that we could start the process with everything scattered randomly 
over the desk. This is not a tidy or an ordered initial condition. It is, on the other hand, 
a definite initial condition, with no uncertainty attached to it. If entropy is uncertainty, 
then a definite initial state has the same (zero) amount of entropy whether it is tidy or 
untidy, ordered or disordered. It is the certainty in configuration that is lost if we fail 
to follow the details of the desktop dynamics as time progresses, not the tidiness or the 
order. The rise in this uncertainty is equivalent to the increase in entropy. 

As an extension to this reasoning, the initial condition might be that the system is in 
one of a certain number of configurations, perhaps similar to one another, but perhaps 
completely different: an arbitrary collection of my favourite desktop arrangements. Such a 
slightly indefinite initial state would evolve into a more indefinite state: a low but nonzero 
entropy situation evolves into one with a higher entropy. This is a more sophisticated 
description of the evolution of a complex system than a picture of order turning into 
disorder. This is the meaning of the second law. 
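
The argument can be illustrated with a toy calculation. The sketch below is not taken from the text: it assumes a hypothetical desk with only eight possible arrangements, an invented symmetric hopping rule between neighbouring arrangements, and the Shannon form of uncertainty, the sum of -p ln p over arrangements, as the measure of entropy in units of k. Under those assumptions, a definite initial arrangement starts with zero uncertainty and a slightly indefinite one starts with a small nonzero value; both rise towards the maximum as the details of the dynamics are lost.

```python
import math

def shannon_entropy(p):
    """Uncertainty sum(-p ln p) of a probability distribution (units of k)."""
    return sum(-x * math.log(x) for x in p if x > 0.0)

def step(p, hop=0.2):
    """One step of a symmetric random walk on a ring of arrangements.

    With probability `hop` the desk moves to each neighbouring arrangement,
    otherwise it stays put. The rule is an illustrative assumption.
    """
    n = len(p)
    return [
        (1.0 - 2.0 * hop) * p[i] + hop * p[(i - 1) % n] + hop * p[(i + 1) % n]
        for i in range(n)
    ]

def entropy_history(p0, steps):
    """Entropy of the distribution after 0, 1, ..., `steps` updates."""
    history = [shannon_entropy(p0)]
    p = list(p0)
    for _ in range(steps):
        p = step(p)
        history.append(shannon_entropy(p))
    return history

n_configs = 8
definite = [1.0] + [0.0] * (n_configs - 1)              # one known arrangement
slightly_vague = [0.5, 0.5] + [0.0] * (n_configs - 2)   # one of two favourites

for label, p0 in [("definite", definite), ("slightly indefinite", slightly_vague)]:
    h = entropy_history(p0, 200)
    print(f"{label}: S/k starts at {h[0]:.3f}, ends near {h[-1]:.3f} "
          f"(maximum ln {n_configs} = {math.log(n_configs):.3f})")
```

Nothing in the calculation refers to tidiness: the starting arrangement could be any of the eight, ordered or not, and the entropy history would be the same.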

Really, discussions of desks or even rooms becoming untidy should include shutting 
the door to the room (and maybe putting up an entropy hazard warning sign!). We leave 
the occupant to rearrange things according to his or her wishes. The configuration of the 
room changes with time and, from the other side of the door, we do not know exactly 
how it proceeds. All we can do is occasionally ask the occupant for some information 
that does not specify the exact arrangement, but instead is more generic, such as how 
much desk is showing. Our knowledge about the state of affairs inside the room is 
steadily impaired, and eventually goes to a minimum, based on what we can discover 
remotely. 

This is how we interrogate a macroscopic system, allowing us to close in on the 
meaning of thermodynamic entropy. The macroscopic equilibrium state of a gas is 
described by a measurable density and temperature, but this is insufficient to specify 
its exact microscopic state, which would be a list of the locations and velocities of all 
the atoms, at least from a classical physics perspective. This is an occasion when
admitting ‘I do not know what is going on’ is extremely profound. Thermodynamic entropy
is a measure of this uncertainty: it is proportional to the logarithm of the number of 
microscopic configurations compatible with the available measurements or information. 
We can categorise those configurations into different classes, such as ‘gas concentrated in 
a corner’ or ‘gas spread out uniformly in the container’, and then estimate the likelihood 
that the system might be found in each class, as long as probabilities for each microscopic 
configuration are provided. We choose these probabilities on the basis of what we might
know about the dynamics or by sophisticated ‘best judgement’. 

For an isolated system in equilibrium, equal probabilities for all configurations are 
often assumed, which is perhaps an oversimplification, but it implies that the system 
is most likely to be found in the macroscopic class that possesses the greatest number 
of configurations. If the system were disturbed by the release of some constraint (say 
a change in confining volume), it would eventually find a new equilibrium, and again 
take the class with the most microscopic states. In equilibrium, the macroscopic state 
with the greatest uncertainty is chosen. In this way, an arrow of macroscopic change (or 
of time, loosely) emerges and it is characterised by entropy increase. 
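
The counting behind this statement can be sketched for a deliberately crude model. The example below is an illustration under stated assumptions, not the book's treatment: each of N labelled particles is simply in the left or the right half of a container, every such configuration is taken to be equally likely, and a macroscopic class is labelled by the number of particles on the left. The function names are hypothetical.

```python
import math

def multiplicity(n_total, n_left):
    """Number of ways to choose which n_left of n_total labelled particles sit on the left."""
    return math.comb(n_total, n_left)

def class_probability(n_total, n_left):
    """Probability of a macroscopic class, assuming all configurations are equally likely."""
    return multiplicity(n_total, n_left) / 2 ** n_total

def entropy_over_k(n_total, n_left):
    """Logarithm of the number of configurations in the class (entropy in units of k)."""
    return math.log(multiplicity(n_total, n_left))

N = 100
most_likely = max(range(N + 1), key=lambda n: multiplicity(N, n))
print("class with most configurations: n_left =", most_likely)
print("S/k with all particles on one side:", entropy_over_k(N, 0))
print("S/k with particles evenly spread  :", round(entropy_over_k(N, N // 2), 2))

# Probability that the split lies within 10 particles of an even share.
near_even = sum(class_probability(N, n) for n in range(40, 61))
print("P(40 <= n_left <= 60) =", round(near_even, 3))
```

Even for only 100 particles the classes close to an even split carry almost all of the probability; for a macroscopic number of particles the dominance of the largest class becomes overwhelming.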

It is sometimes said that the universe is falling to bits, or that everything is going 
wrong, but this is a profoundly pessimistic view of the events that we attempt to describe
with the second law. The statement that disorder is always on the increase carries the
same gloomy view about the future. But does the interpretation that uncertainty is
increasing offer anything more positive?

The evolution of the universe is a consequence of the rules of interaction between 
the component particles and fields, many of which we have determined in detail in the 
laboratory. These rules recognise no such thing as pessimism or decline. The universe is 
simply following a dynamical trajectory. But one of the core features of the dynamics is 
that transfers take place between participants in a way that seems to favour the sharing 
out of energy or space between them. The attributes of the universe are being mixed up 
in a manner that is hard to follow and our failure to grasp and retain the detail of all this 
is what is meant by the growth of uncertainty. However, we could interpret this failure 
as a reflection of the richness of the dynamics of the world and all its possibilities. We 
could perhaps view the second law more positively as a statement about the extraordinary 
complexity and promise that the universe can offer as it evolves. 

The growth of entropy is our rationalisation of this complexity. We can explain the 
direction of macroscopic change, including events taking place in a test tube as well as 
processes occurring in the wider cosmos, on the basis of a simply stated and implemented 
rule of Nature. We can do this without having to delve too deeply into the microscopic 
laws: it seems that in certain important ways they all have a similar effect. The second 
law is a reflection of an underlying imperative to mix, share and explore, such that 
certain macroscopic events happen frequently, because they are nearly inevitable under 
such circumstances, while others occur more rarely. 

So if we wish to ascribe a motivation to the workings of the universe, instead of 
arguing that the natural direction of change is towards disorder and destruction, we might 
regard the dynamics as essentially egalitarian and, as an indirect consequence, potentially 
benevolent. Particles of a gas with more than their fair share of energy naturally tend 
to pass some to their slower neighbours. Energy will flow, but this does not mean that 
the exceptional cannot arise. The toolbox of physical processes available to the world is 
so well stocked that the flow can be partly intercepted and put to use in building and 
maintaining complex structures. Nature will find opportunities to feed off energy flows 
in extraordinary ways: mixing and sharing seem to have the capacity to build as well 
as to dissipate, at least until the mixing is completed. These are themes that are worth 
developing. 


2 

Classical Thermodynamics 


Our main focus is statistical thermodynamics, but it is important to consider first the 
discipline of classical thermodynamics since it provides a framework and back-story to 
the main developments in the book. In this chapter, we describe the basic rules with 
special consideration given to the role of entropy, and in the next, we enlarge on some 
of the applications. The discussion of statistical thermodynamics starts in Chapter 4. 


2.1 The Classical Laws of Thermodynamics 

Thermodynamics emerged from the empirical science of calorimetry, the measurement 
of the generation and transfer of thermal energy, or heat, and from the development of 
technology to extract mechanical energy, or work, from a heat source. It was then 
extended to include consideration of the properties of matter and transformations between 
phases such as solids, liquids and gases. It is a theory of the macroscopic transfer of heat 
and mass, events that are known as thermodynamic processes. Strictly the focus of the 
theory is on systems that are in thermal equilibrium, the situation reached when all the 
processes have ceased. It is summed up in the four classical laws of thermodynamics, 
which are statements of empirical observation: 

Zeroth law If two systems are in thermal equilibrium with a third system, then they
are in equilibrium with each other; in fact there is a single system property (called
temperature) that serves to indicate whether systems are in thermal equilibrium;

First law There is a system property called energy that is conserved, but can take several
different forms that interconvert;

Second law There is a system property called entropy that, if the system is isolated from
its environment, either increases or (in principle) remains constant during thermodynamic
processes;

Third law The entropy of a system is a universal constant (set to zero) at the absolute
zero of temperature.

Entropy appears in two of these laws, and is a central concept in thermodynamics. It has
acquired a reputation for being hard to understand, and for this reason, entropy will be 
the focus of the discussion of classical thermodynamics in this chapter. Energy is a much 
more familiar concept: we buy it, we ‘use’ it and we read about it on food packaging, 
but it is possible to develop some intuition for entropy as well. 

In the early development of classical thermodynamics, there was little fundamental 
understanding of what entropy actually represented. This situation was transformed when 
Boltzmann and Gibbs (and others) invented statistical mechanics towards the end of the 
nineteenth century, although there are still controversies to this day. To repeat the claim 
made in the previous chapter, it can be understood to represent uncertainty, and, in a 
limited sense, disorder - a lack of information about the detail of a system. Its evolution 
has been associated with the winding down of the universe after the initial impetus of the 
Big Bang. Philosophers suggest that it plays a role in our perception of the directionality 
of time. A startling set of notions to emerge from the simple science of calorimetry and 
the technology of the steam engine! 

We shall see in later chapters what the fundamental insight of statistical mechanics was, 
and understand why it is written, in mathematical notation, on Boltzmann’s gravestone. 
However, we can get a general feel for entropy by studying a simple example, before 
extending to more general systems. An example can also provide us with a grounding 
in the sometimes confusing concepts of heat, work and energy. We shall consider the 
ideal, monatomic, classical gas or ideal gas for short. 


2.2 Macroscopic State Variables and Thermodynamic 
Processes 

The statement of the first law of thermodynamics conveys to us something of the nature 
of thermodynamics and the phenomenology of classical thermodynamic processes. It 
concerns the conservation and interconversion of energy, an example of a parameter, 
variable or function of state (a quantity that specifies the macroscopic condition of a 
physical system when it is in equilibrium). Other examples include pressure, temperature 
and volume: all measurable and familiar in macroscopic physics. We shall call them state 
variables. They describe the equilibrium condition of a system without reference to any 
previous history. 

By equilibrium, we mean that there is no time dependence in the condition of a 
system, which includes the absence of fluxes of energy or matter through the system. A 
thermodynamic process can be a transfer of energy or matter into or through a system, 
or some internal change such as freezing, often brought about by a change in one of the 
constraints imposed on it, that has the ultimate effect of altering one or more of the state 
variables of the system. 

There are two types of macroscopic state variable. There are those that are proportional
to the amount of material in the system, such as energy, that we call extensive, and
those such as temperature that do not change if we replicate a system to make a larger
one: these we call intensive. Further examples are given in Figure 2.1, which also sketches
the ‘world-view’ that we take in thermodynamics. According to this view, we focus our
attention on the behaviour of a system, which could be a flask of helium, a lump of
steel or a bottle of milk, and regard everything else as the environment, characterised
by just a few parameters and an ability to exchange various quantities with the system.
The environment is often assumed to be very large in extent compared with the system
of interest.

[Figure 2.1: system 1 has extensive variables $E$, $V$, $N$ and intensive variables $T$, $p$, $\mu$; system 2 has $2E$, $2V$, $2N$ and the same $T$, $p$, $\mu$; the environment has energy $E_r$, volume $V_r$, particle number $N_r$, temperature $T_r$, pressure $p_r$ and chemical potential $\mu_r$.]

Figure 2.1 The world-view according to thermodynamics. An environment is characterised by
the macroscopic properties labelled by a suffix r. The one that might be unfamiliar is chemical
potential, which we discuss later. Systems coupled to this environment are characterised by similar
properties, shown here without a suffix. System 2 is simply two copies of system 1 joined together.
Intensive state variables do not change under such replication, but extensive variables double.
Furthermore, when in equilibrium, it is the intensive state variables of the system that normally
equal those of the environment, for reasons that we shall come to later.

In thermodynamics, attention is often given to the internal energy, defined to be the 
total energy of a system minus any bulk translational or rotational kinetic energy, and 
minus the potential energy due to any externally imposed fields, such as gravitational 
energy. It therefore comprises just the kinetic and potential energy of internal motion 
or interactions. We shall find it more convenient, however, to work with the sum of the 
internal energy and any externally imposed potential energy. We shall use the symbol E 
to represent this energy. 

Energy conservation is a rather fundamental principle in physics, and the energy of 
a system may therefore be changed only by transfers from the environment brought 
about by heat flow, for example, or by distorting it using a mechanical force and thereby 
performing work. It helps perhaps to regard these as transfers of kinetic and potential 
energy, respectively. Work is an energy transfer brought about by the application of 
an external force of some kind. It corresponds to a transfer of potential energy from 
the environment, such as the fall of a weight under gravity to move the piston that 
compresses a gas. Heat transfer is an energy change brought about by passing molecular 
kinetic energy into a system, through collisions at an interface, for example. Then the 
first law states that the state variable $E$ can receive incremental contributions from the
environment in the form of heat $\Delta Q$ and work $\Delta W$. We then write the first law of
thermodynamics in the form $\Delta E = \Delta Q + \Delta W$.

Since $\Delta Q$ and $\Delta W$ represent incremental changes in energy of the system associated
with different transfer processes, they do not represent increments in (purported) state
variables $Q$ and $W$: a system does not contain specific quantities of heat or work, only
a certain energy. As a reminder, some treatments use đQ and đW when specifying heat
and work increments, and refer to them as inexact differentials. We shall not use this
notation. As long as we grasp that $dQ$ and $dW$ are increments of energy that specify
the course of a certain process while $dE$ is the increment in the energy state variable
resulting from the process, the likelihood of confusion in the meaning is minimal. We
can certainly integrate increments $dQ$ to obtain the heat transfer $\Delta Q = \int dQ$ over a
process, just as we can calculate changes in state variables such as $\Delta E = \int dE$, but
we always note that $\Delta Q$ is not a difference in a state variable, while $\Delta E$ is. The heat
transfer might depend on the specific sequence of connections made to sources of heat 
during the thermodynamic process, but a state variable is independent of the previous 
history of a system, and therefore a change in state variable does not depend on the 
thermodynamic path taken between initial and final states. 
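
A small numerical sketch may help to fix the distinction. It is an illustration under stated assumptions rather than part of the original argument: it borrows the ideal-gas relation pV = (2/3)E obtained in Section 2.3, and it uses the standard expression for the work done on a gas during a slow change of volume at constant pressure, minus p times the change in V, which is not derived in this part of the text. The states and function names are hypothetical. Two different paths connect the same initial and final states; the change in the state variable E is the same for both, while the heat and work are not.

```python
def energy(p, V):
    """Energy of a monatomic ideal gas, E = (3/2) pV, from pV = (2/3)E."""
    return 1.5 * p * V

def isobaric_work(p, V_start, V_end):
    """Work done ON the gas during a slow volume change at constant pressure.

    Uses the standard quasistatic expression -p * (change in V), which is an
    assumption here: it is not derived in this part of the text.
    """
    return -p * (V_end - V_start)

def heat_from_first_law(delta_E, work):
    """Heat transferred to the gas, rearranging Delta E = Delta Q + Delta W."""
    return delta_E - work

p1, V1 = 1.0e5, 1.0e-3   # initial state (Pa, m^3), illustrative values
p2, V2 = 2.0e5, 2.0e-3   # final state

delta_E = energy(p2, V2) - energy(p1, V1)

# Path A: heat at fixed volume V1 (no work), then expand at pressure p2.
work_A = isobaric_work(p2, V1, V2)
# Path B: expand at pressure p1, then heat at fixed volume V2 (no work).
work_B = isobaric_work(p1, V1, V2)

for name, work in [("A", work_A), ("B", work_B)]:
    heat = heat_from_first_law(delta_E, work)
    print(f"path {name}: dE = {delta_E:.0f} J, dW = {work:.0f} J, dQ = {heat:.0f} J")
```

With these illustrative numbers the energy change is 450 J on both paths, while the heat supplied is 650 J on path A and 550 J on path B.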

It is worth pointing out that the conservation of energy embodied by the first law 
holds whether the initial and final states are in or out of equilibrium. However, most 
state variables in thermodynamics describe systems that are in equilibrium. For example, 
the state variable temperature, which is mentioned in the zeroth law of thermodynamics 
as an indicator of whether two systems are in thermal equilibrium, most definitely is an 
equilibrium property. Of course, we frequently apply the concept of temperature to a 
system when it is heating up or cooling down, and therefore out of equilibrium, but this 
view only really holds if the system is only mildly perturbed away from equilibrium, 
which means that heat flows should not be too large. In the same way, the state variables 
pressure and entropy are properly ascribed only to equilibrium states, but in certain 
circumstances, the concepts can be stretched to apply to nonequilibrium situations, which 
we return to briefly in Section 2.14 and again in Chapter 15. 


2.3 Properties of the Ideal Classical Gas 

We shall frequently use the monatomic ideal classical gas to illustrate aspects of
thermodynamics. An ideal gas consists of particles that do not interact with each other, but
only with the walls of the container in which they are confined. The equation of state of 
the ideal classical gas is 

pV = NkT, (2.1) 

where $p$ is the pressure, $V$ is the volume, $N$ is the number of particles and $T$ is the
temperature. This is also known as the ideal gas law. The pressure, volume and temperature
of a gas characterise its equilibrium state, and satisfy the equation of state, irrespective
of whether the state was established by compressing, expanding, heating or cooling a
previous state. The remaining symbol in (2.1) is Boltzmann’s constant $k$, which is
numerically equal to $1.38 \times 10^{-23}\,\mathrm{J\,K^{-1}}$. This equation involves the concepts of pressure and
temperature; so even though they might be very familiar to us, we should consider the 
nature of these state variables and what they mean empirically. 
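
As a quick numerical check of (2.1), the sketch below evaluates the number of particles in an illustrative state and confirms that replicating the system leaves the pressure unchanged. The chosen state (one litre at atmospheric pressure and 300 K) and the function names are assumptions made for the example, and the more precise SI value of Boltzmann's constant is used in place of the rounded figure quoted in the text.

```python
K_BOLTZMANN = 1.380649e-23  # J/K (SI value; the text quotes 1.38e-23)

def pressure(N, V, T, k=K_BOLTZMANN):
    """Pressure from the ideal gas law pV = NkT."""
    return N * k * T / V

def particle_number(p, V, T, k=K_BOLTZMANN):
    """Number of particles from the ideal gas law pV = NkT."""
    return p * V / (k * T)

# Illustrative state: one litre of gas at atmospheric pressure and 300 K.
p, V, T = 101325.0, 1.0e-3, 300.0
N = particle_number(p, V, T)
print(f"N = {N:.3e} particles")

# Replicating the system doubles the extensive variables N and V,
# while the intensive variables p and T are unchanged.
print("pressure of doubled system / original:", pressure(2 * N, 2 * V, T) / p)
```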

Pressure is readily interpreted by picturing a gas as a collection of many particles,
with a range of velocities such that they collide with each other and with the walls of
the container. Pressure is the average normal force per unit area exerted on the wall
and can be calculated in the following way. Imagine that a particle of mass $m$ and
velocity in the x-direction $v_x$ collides with a wall and is reflected perfectly, as illustrated
in Figure 2.2. The change in particle momentum in the x-direction is $-2mv_x$; so the
momentum transferred to the wall is $2mv_x$. This is not the only particle that hits the
wall in time $dt$. All particles with positive velocity $v_x$ and situated less than a distance
$v_x\,dt$ from the wall will make a collision; so the momentum change in the time interval
is $2mv_x\,n(v_x)dv_x\,Av_x\,dt$ where $n(v_x)dv_x$ is the number of particles per unit volume with
velocity between $v_x$ and $v_x + dv_x$ and $A$ is the wall area. The force on the wall is the
rate of transfer of momentum to it, according to Newton’s second law; so we divide this
expression by $dt$ and sum over all positive velocity cohorts, giving a force

$F = \int_0^\infty 2mv_x^2\,n(v_x)\,A\,dv_x = 2mAn_+\langle v_x^2\rangle$, (2.2)

where $n_+ = \int_0^\infty n(v_x)\,dv_x$ is the total number density of particles with positive $v_x$ (i.e.
travelling towards the wall). In equilibrium, this is just $n/2$, where $n = N/V$ is the total
particle number density.

Figure 2.2 All gas particles with a positive velocity $v_x$, and located within a distance $v_x\,dt$
of a wall, collide with it in a period of time $dt$, giving rise to momentum transfer and hence
pressure.

The brackets in the final expression indicate an average defined with respect to the
weighting $n(v_x)dv_x/n_+$, which is essentially a probability that a particle should have an
x-component of velocity in the region of width $dv_x$ around $v_x$. Dividing by the wall area
gives the pressure in the form $p = mn\langle v_x^2\rangle$. Then we note that in an equilibrium state,
the mean square velocity components are time independent and statistically equivalent,
as there is no mean flow, which implies that $\langle v_x^2\rangle = \langle v_y^2\rangle = \langle v_z^2\rangle = (1/3)\langle v^2\rangle$ where $v$ is
the magnitude of the velocity, in which case we obtain

$pV = \frac{1}{3}Nm\langle v^2\rangle$. (2.3)
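
The two ingredients of this result, the equivalence of the velocity components and the link between (2.3) and (2.1), can be checked numerically. The sketch below is a hedged illustration: it assumes that each velocity component is Gaussian with zero mean and variance kT/m (the Maxwell-Boltzmann form, which is not derived in this section), and it uses an approximate mass for a helium-4 atom. The function names are hypothetical.

```python
import random

K_BOLTZMANN = 1.380649e-23   # J/K
MASS_HELIUM = 6.646e-27      # kg, approximate mass of a helium-4 atom

def sample_velocities(n_samples, T, m, seed=0):
    """Draw velocity vectors with independent Gaussian components.

    Assumption: each component has zero mean and variance kT/m
    (the Maxwell-Boltzmann form, not derived in this section).
    """
    rng = random.Random(seed)
    sigma = (K_BOLTZMANN * T / m) ** 0.5
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n_samples)]

def mean_squares(velocities):
    """Return (<v_x^2>, <v^2>) averaged over the sample."""
    n = len(velocities)
    vx2 = sum(vx * vx for vx, _, _ in velocities) / n
    v2 = sum(vx * vx + vy * vy + vz * vz for vx, vy, vz in velocities) / n
    return vx2, v2

T = 300.0
velocities = sample_velocities(100_000, T, MASS_HELIUM)
vx2, v2 = mean_squares(velocities)

# Statistical equivalence of components: <v_x^2> should be close to <v^2>/3.
print("<v_x^2> / (<v^2>/3) =", vx2 / (v2 / 3.0))
# Equation (2.3) per particle, pV/N = (1/3) m <v^2>, compared with kT from (2.1).
print("(m <v^2>/3) / kT    =", MASS_HELIUM * v2 / 3.0 / (K_BOLTZMANN * T))
```

Both ratios should come out close to one, with a scatter set by the finite sample size.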


Next, we relate the energy of a monatomic ideal gas in equilibrium to the same
statistical property $\langle v^2\rangle$ of the particle velocity distribution. The energy is entirely kinetic,
as there are no interactions between the particles, making the potential energy zero.
The mean energy of one atom is $\langle E_1\rangle = (1/2)m\langle v^2\rangle$ where $m$ is the atomic mass. The
mean energy of $N$ atoms, assuming them to be statistically independent, is $\langle E_N\rangle =
(1/2)Nm\langle v^2\rangle$. Finally, we write $E = \langle E_N\rangle$. There should be a correction to exclude bulk
translation and rotation, but let us assume that there are many particles and that the
correction is negligible. Thus

$$E = \tfrac{1}{2}Nm\langle v^2\rangle, \tag{2.4}$$

and combining (2.3) and (2.4) we obtain

$$pV = \tfrac{2}{3}E, \tag{2.5}$$

which is a connection between the pressure and energy of an ideal gas confined to a 
container of volume V. 
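
The relation (2.5) can be checked numerically. The sketch below is only an illustration: it assumes, for convenience, that each velocity component is drawn from a Gaussian distribution (the argument in the text needs only isotropy), and the function names and the choice of unit mass are hypothetical. It compares the wall-collision estimate $pV = Nm\langle v_x^2\rangle$ with two thirds of the kinetic energy.

```python
import random

def sample_gas(n_particles, sigma, seed=0):
    """Draw (vx, vy, vz) for each particle; Gaussian components are an assumption."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n_particles)]

def pressure_volume_product(velocities, mass):
    """pV = N m <v_x^2>, the wall-collision result before isotropy is invoked."""
    return mass * sum(vx * vx for vx, _, _ in velocities)

def kinetic_energy(velocities, mass):
    """E = (1/2) N m <v^2>, as in equation (2.4)."""
    return 0.5 * mass * sum(vx * vx + vy * vy + vz * vz
                            for vx, vy, vz in velocities)

velocities = sample_gas(n_particles=100_000, sigma=1.0)
pV = pressure_volume_product(velocities, mass=1.0)
E = kinetic_energy(velocities, mass=1.0)
ratio = pV / E  # expected to lie close to 2/3, up to sampling noise
print(f"pV / E = {ratio:.4f}")
```

The ratio agrees with 2/3 only statistically; the residual scatter shrinks as the number of sampled particles grows.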

Together with (2.1), this implies that 

$$E = \tfrac{3}{2}NkT, \tag{2.6}$$

for the monatomic ideal classical gas, which suggests that the state variable temperature 
appearing in the ideal gas law is a measure of the mean kinetic energy per particle of 
a system in thermal equilibrium. Temperature is also supposed to be an indicator of 
thermal equilibrium between systems and we can see how that operates in this example. 
Boyle’s law, an empirical property of a rarefied gas, simply states that the product pV 
does not change for an ideal gas after a compression or expansion, as long as the initial 
and final states are in thermal equilibrium with a given environment. It makes perfect 
sense that the ideal gas law should equate this product to the expression NkT such that 
ideal gases with the same value of T are in thermal equilibrium, or isothermal. 

In this way, Boyle’s law provides us with an empirical temperature scale. An ideal 
gas can act as a thermometer through the value of pV it acquires when placed in thermal 
equilibrium with different environments. In order that the reading on the thermometer 
should not depend on how much ideal gas we use, we should make the temperature 
scale a function of the value of pV/N, which would be an intensive state variable. 
For convenience, the quantity pV/Nk for an ideal gas in thermal equilibrium with an 
environment consisting of pure water at its triple point of equilibrium between the solid, 
liquid and gas phases is used to define a reference temperature of 273.16 K. If the ideal 
gas is brought into thermal contact with an environment that is not isothermal with the 
water triple point mixture, then the equilibrium value of pV/Nk for the ideal gas will 
differ from 273.16 K and will serve to denote the temperature of the environment. We 
could have used the combination pV/N as an empirical temperature measured in joules, 
but the benefit of retaining k is that we can celebrate Boltzmann’s contribution to gas 
physics by having a constant named after him. 
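
The gas thermometer described above can be sketched in a few lines. This is a minimal illustration, not a description of a real instrument: the function names are hypothetical, and the "measured" pressures are simply generated from the ideal gas law as stand-ins for laboratory readings.

```python
K_BOLTZMANN = 1.380649e-23  # J/K
T_TRIPLE = 273.16           # K, reference temperature at the water triple point

def empirical_temperature(p, V, N):
    """Temperature read directly as pV/(Nk), an intensive quantity."""
    return p * V / (N * K_BOLTZMANN)

def temperature_from_ratio(pV_env, pV_triple):
    """Temperature from the ratio of pV values for the same sample of gas."""
    return T_TRIPLE * pV_env / pV_triple

# Hypothetical sample: N atoms in a fixed volume V.
N, V = 1.0e22, 1.0e-3
p_triple = N * K_BOLTZMANN * T_TRIPLE / V  # stand-in for a calibration reading
p_env = 1.2 * p_triple                     # stand-in for a reading elsewhere

reading = temperature_from_ratio(p_env * V, p_triple * V)
same_reading_more_gas = empirical_temperature(2 * p_env, V, 2 * N)
print(reading, same_reading_more_gas)
```

Doubling the amount of gas at fixed volume doubles the pressure but leaves the reading unchanged, which is the reason for building the scale on $pV/N$.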


2.4 Thermodynamic Processing of the Ideal Gas 

Now we consider some simple thermodynamic processes involving the compression of
an ideal gas, possibly with heat transfer between the system and an environment at a
constant temperature. Arbitrary rates of heat transfer or compression will in general disturb
the thermal equilibrium between a system and its environment, and strictly temperature
can only be determined once equilibrium has been restored. But we can imagine that
the process is conducted very slowly, such that thermal equilibrium between system
and environment can be approximately maintained throughout, and the compression of
the system then proceeds through a sequence of isothermal equilibrium states. Alternatively,
the gas could be thermally isolated from the environment and slowly compressed
through a sequence of equilibrium states each with a well-defined temperature. This
idealisation of a thermodynamic process is often invoked, and is called a quasistatic
process. Otherwise, a process is said to be nonquasistatic.

Let us consider the quasistatic performance of mechanical work that brings about the
compression of a gas in a cylinder. The work done on the gas is simply the applied
force $f$ times the distance moved by the piston $dx$. It is a transfer of potential energy
from the environment. If the compression is slow, the gas is always well approximated
by an equilibrium state, and we can assume that it exerts a uniform pressure $p$ against
the piston head, of area $A$, and that the force $pA$ equals the applied external force $f$. We
write $f\,dx = -(f/A)(-A\,dx) = -p\,dV$, where $dV$ is the change in system volume (here
negative) brought about by the compression.

If the piston were moved nonquasistatically, various complications would ensue: the 
pressure of the gas might not be uniform, the applied and resistive forces might not 
balance, and shock waves, convective motion or sound might be generated. All this 
makes such a process hard to analyse. 

For a quasistatic compression, in contrast, and in the absence of heat transfer, we can
simply state that $dE = dW = -p\,dV$. We then proceed using (2.5):

$$dE = \tfrac{3}{2}d(pV) = \tfrac{3}{2}(p\,dV + V\,dp) = -p\,dV, \tag{2.7}$$

so

$$\frac{5}{2}\frac{dV}{V} = -\frac{3}{2}\frac{dp}{p}, \tag{2.8}$$

such that $(5/2)\ln V = -(3/2)\ln p + \text{constant}$, and hence

$$pV^{5/3} = \text{constant}, \tag{2.9}$$


which describes the quasistatic, adiabatic (meaning thermally isolated) compression of
a monatomic ideal classical gas on a $p$-$V$ diagram. It may be contrasted with the
condition $pV = \text{constant}$ for quasistatic isothermal compression, associated with (2.1),
where the temperature is held constant by allowing heat transfer between the system and
the environment. By inserting the equation of state (2.1), the adiabatic compression can
be represented on a $T$-$V$ diagram as $TV^{2/3} = \text{constant}$, indicating that the temperature
rises during the compression. But is anything held constant? We shall see.
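
As a rough numerical check of (2.9), the sketch below steps a gas through a slow adiabatic compression using only $dE = -p\,dV$ and $E = (3/2)pV$, without assuming the power law in advance. The function name, the midpoint stepping scheme and the dimensionless starting values are illustrative choices, not taken from the text.

```python
def adiabatic_compress(p0, V0, V1, steps=10_000):
    """Integrate dE = -p dV with E = (3/2) p V from volume V0 to V1."""
    E = 1.5 * p0 * V0                 # equation (2.5) rearranged
    V = V0
    dV = (V1 - V0) / steps            # negative for a compression
    for _ in range(steps):
        p = 2.0 * E / (3.0 * V)
        E_mid = E - p * 0.5 * dV      # half step to estimate the midpoint state
        V_mid = V + 0.5 * dV
        p_mid = 2.0 * E_mid / (3.0 * V_mid)
        E -= p_mid * dV               # full step using the midpoint pressure
        V += dV
    return 2.0 * E / (3.0 * V)        # final pressure

p0, V0, V1 = 1.0, 1.0, 0.5
p1 = adiabatic_compress(p0, V0, V1)

invariant_before = p0 * V0 ** (5.0 / 3.0)
invariant_after = p1 * V1 ** (5.0 / 3.0)
temperature_ratio = (p1 * V1) / (p0 * V0)  # T1/T0 from the equation of state
print(invariant_before, invariant_after, temperature_ratio)
```

The product $pV^{5/3}$ is preserved to within the integration error, while the temperature ratio comes out close to $(V_0/V_1)^{2/3}$, so the gas warms as it is compressed.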

Consider next the change in energy due to a process of heat input, with the volume
held constant so that no external work is done on the system, and once again with
the number of particles in the system fixed. We now write the first law as $dE = dQ$.
When heat is injected into a system, we expect its temperature to change since it is
being disturbed from a previously isothermal state. It is of interest to calculate a heat
capacity, defined as the amount of heat required to raise the temperature of the system
by a specified amount. For the monatomic ideal classical gas, we have

$$\frac{dQ}{dT} = \frac{dE}{dT} = \frac{d}{dT}\left(\tfrac{3}{2}NkT\right) = \tfrac{3}{2}Nk, \tag{2.10}$$

and this is referred to as the heat capacity at constant volume, denoted $C_V$ and defined
in general by

$$C_V = \left(\frac{\partial E}{\partial T}\right)_{N,V}. \tag{2.11}$$

If we relax the condition of constant volume and allow mechanical work $dW$ to take
place during the process, we write instead

$$dQ = dE - dW. \tag{2.12}$$


Note that the convention employed throughout is that $dQ$ and $dW$ denote heat and
work energy given to the system. Assuming the process is quasistatic, $dW = -p\,dV$ and
the system temperature remains spatially uniform. If we divide $dQ$ by the change in
temperature that accompanies the process we get

$$\left(\frac{dQ}{dT}\right)_q = \frac{dE}{dT} + p\frac{dV}{dT}. \tag{2.13}$$


The label q is there to emphasise that the heat input is made quasistatically. We do not 
add a quasistatic label to the derivatives of energy and volume with respect to temperature 
because all the quantities involved are state variables that characterise equilibrium states. 
By definition, the change in a state variable does not depend on the rate at which we 
conduct the process (as long as we let the final state come to equilibrium, of course). The 
incremental changes in energy, volume and temperature brought about by the process are 
independent of the history. In contrast, the delivery of heat is process specific, and only 
for a quasistatic process can the ratio dQ/dT be related to the particular form in (2.13). 

If we imagine that the delivery of heat to the system is brought about by thermal 
contact with an environment with a quasistatically increasing temperature $T_r(t)$, then
the temperature of the system T will take on the same time dependence. The pressure 
evolves according to the equation of state, and remains spatially uniform, such that there 
are no convection currents, thermal or pressure gradients induced during the process, 
and the energy then changes according to (2.6), fully specifying the right hand side of 
(2.13). We conclude that if quasistatic work is performed while the temperature is raised, 
then the amount of heat drawn from the environment will be affected: in short, the heat 
capacity of the system will differ from the constant volume case. 

So heat capacity is a generic term, and it depends on the conditions that are imposed.
It is usual to consider a particular case where external work is performed at constant
pressure. With $p$ held constant during the process, the last term in (2.13) becomes
$d(pV)/dT = d(NkT)/dT = Nk$, and the heat capacity, at constant pressure, of the
monatomic ideal classical gas is then

$$C_p = \tfrac{5}{2}Nk. \tag{2.14}$$


Note that $C_p/C_V = 5/3$. This ratio is denoted $\gamma$ for more general systems that we shall
consider in Section 3.5. Also note that this ratio is the same as the exponent in (2.9)
describing adiabatic compression. It turns out that this is no coincidence.
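
The two heat capacities can be recovered from (2.13) by taking a small quasistatic temperature step and evaluating $dE + p\,dV$. The sketch below is a minimal illustration under the ideal gas relations of this chapter; the function names and the particular values of $N$, $T$ and $p$ are hypothetical.

```python
K_BOLTZMANN = 1.380649e-23  # J/K

def energy(N, T):
    """E = (3/2) N k T, equation (2.6)."""
    return 1.5 * N * K_BOLTZMANN * T

def volume(N, T, p):
    """Equation of state pV = NkT solved for V."""
    return N * K_BOLTZMANN * T / p

def heat_capacity(N, T, dT, p=None):
    """dQ/dT = dE/dT + p dV/dT for a small step; p=None means constant volume."""
    dE = energy(N, T + dT) - energy(N, T)
    if p is None:
        return dE / dT                      # no work done at fixed volume
    dV = volume(N, T + dT, p) - volume(N, T, p)
    return (dE + p * dV) / dT               # heat also supplies the work p dV

N, T, dT, p = 6.0e23, 300.0, 1.0e-3, 1.0e5
C_V = heat_capacity(N, T, dT)
C_p = heat_capacity(N, T, dT, p=p)
gamma = C_p / C_V
print(C_V / (N * K_BOLTZMANN), C_p / (N * K_BOLTZMANN), gamma)
```

In units of $Nk$ the results are close to 3/2 and 5/2, so the extra $Nk$ at constant pressure is exactly the work done by the expanding gas per unit temperature rise.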


2.5 Entropy of the Ideal Gas


We now consider the thermodynamic processing of a monatomic ideal classical gas that
leads to the concept of thermodynamic entropy. We transfer heat quasistatically from an
environment into a system and consider the rather innocuous looking quantity $(dQ/T)_q$
that characterises an incremental stage in such a process. It is the heat transfer to a
system modulated by its temperature, and the latter is equal to the temperature of the
environment, since the change is quasistatic. The environmental temperature is assumed
to change slowly and produce a consequent slow evolution in the properties of the
system. Using the first law, we can write

$$\left(\frac{dQ}{T}\right)_q = \frac{dE + p\,dV}{T}, \tag{2.15}$$

and therefore, using the energy-pressure relation,

$$\left(\frac{dQ}{T}\right)_q = \frac{3}{2}\frac{d(pV)}{T} + p\frac{dV}{T} = \frac{5}{2}p\frac{dV}{T} + \frac{3}{2}V\frac{dp}{T} = \frac{5}{2}Nk\frac{dV}{V} + \frac{3}{2}Nk\frac{dp}{p}. \tag{2.16}$$


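If we sum such incremental changes over a complete quasistatic heat transfer process
from equilibrium state a to equilibrium state b, we get

$$\int_a^b\left(\frac{dQ}{T}\right)_q = \left[\frac{5}{2}Nk\ln V + \frac{3}{2}Nk\ln p\right]_a^b = \left[\frac{3}{2}Nk\ln\left(pV^{5/3}\right)\right]_a^b. \tag{2.17}$$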


The left hand side is a measurable quantity: the heat transferred to the system during a
quasistatic process modulated by the changing inverse system temperature as the process
takes place. The right hand side can be expressed in terms of the initial and final pressure
and volume of the gas $(p_a, V_a)$ and $(p_b, V_b)$, noting that $N$ remains the same, and we write

$$\int_a^b\left(\frac{dQ}{T}\right)_q = S(p_b,V_b,N) - S(p_a,V_a,N), \tag{2.18}$$

where we have defined a property of the monatomic ideal classical gas called entropy,
a name coined in 1865 by Rudolf Clausius (1822-1888), that takes the form

$$S(p,V,N) = Nk\ln\left(\frac{p^{3/2}V^{5/2}}{C}\right). \tag{2.19}$$


The entropy S of a monatomic ideal classical gas is a state variable, since it is a function 
of state variables pressure and volume, and we shall see that it turns out to have some 
rather special properties. The quantity C in the denominator is included to make the 
argument of the logarithm dimensionless. It can, in principle, depend on system properties 
that do not change as a result of the process, and the only one in this case is N, the 
number of particles, and so we write C(N). 

Note that in order to raise the temperature of twice the amount of gas, we would need
twice the amount of heat. The change in entropy of the gas resulting from the process
of quasistatic heat transfer $dS = (dQ/T)_q$ is therefore also proportional to the amount
of material; so entropy is extensive, like the volume, energy, or the number of particles
itself, while pressure and temperature are intensive. Thus we require

$$S(p,2V,2N) = 2S(p,V,N) \;\Rightarrow\; 2Nk\ln\left(\frac{p^{3/2}(2V)^{5/2}}{C(2N)}\right) = 2Nk\ln\left(\frac{p^{3/2}V^{5/2}}{C(N)}\right), \tag{2.20}$$

and we deduce that $C$ scales in a particular way with system size: we write $C = cN^{5/2}$
where $c$ is now independent of the state variables $p$, $T$, $N$ and $V$.

There are alternative forms for the entropy of the ideal gas, obtained by inserting the
relation between pressure and energy:

$$S(E,V,N) = Nk\ln\left(\frac{(2E/3N)^{3/2}}{cN/V}\right), \tag{2.21}$$

or by inserting the equation of state

$$S(T,V,N) = Nk\ln\left(\frac{(kT)^{3/2}}{cN/V}\right). \tag{2.22}$$

The entropy per particle of the gas clearly increases as the temperature increases or as the
density $n = N/V$ decreases, and we start to acquire some intuition about the behaviour
of this new state variable.

So entropy is a property of the gas. It is not some vague, hard-to-understand concept: 
it can be written as a perfectly well-defined function of state variables. We have focussed 
on the ideal gas example in order to demonstrate this explicitly. For more complicated 
systems, such as a liquid or a solid, it is not always so easy to derive a mathematical 
expression for entropy but nevertheless it can be calculated, in principle, through the 
defining relationship for entropy differences:

$$\int_a^b\left(\frac{dQ}{T}\right)_q = S(b) - S(a). \tag{2.23}$$


We call the left hand side the Clausius integral and this is the Clausius expression
for entropy change. It is easy to measure entropy (or rather differences in entropy)
experimentally using this relationship, essentially using a thermometer. In addition, notice
something special about this definition. We have shown explicitly for an ideal gas that $S$
is a function of state variables and is therefore a state variable itself. However, we see
that it is related to a sum of increments of a process variable, namely the heat transfer.
A system does not possess a quantity of heat $Q$, as we have emphasised earlier: it only
receives incremental contributions $dQ$ to its energy $E$ during a thermodynamic process.
Summing the $dQ$ over the quasistatic process from state a to state b will in general
produce a $\Delta Q$ that depends on the thermodynamic path, specifically the history of the
compression, expansion and coupling to heat sources that is taken between them. But by
summing the $dQ/T$, we obtain something that is not specific to the path, meaning that
it is a difference in a state variable $S$. Dividing the inexact differential $dQ$ by the state
variable $T$ produces an increment, known as an exact differential, in the state variable $S$.
At the moment, we must regard the extension of all this to materials other than ideal
gases as a conjecture, but a proof using the machinery of Carnot cycles will be given in
Section 2.9.
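
The contrast between the path-dependent heat and the path-independent Clausius integral can be illustrated numerically for the ideal gas. The sketch below is an assumption-laden toy: units are chosen so that $Nk = 1$, the two paths in the $(T, V)$ plane are arbitrary examples, and the function names are hypothetical. Along each leg it accumulates $dQ = dE + p\,dV = (3/2)Nk\,dT + NkT\,dV/V$ and $dQ/T$.

```python
import math

NK = 1.0  # N*k in arbitrary units (an assumption made for convenience)

def integrate_leg(T0, V0, T1, V1, steps=20_000):
    """Return (heat, clausius) along a straight line in the (T, V) plane."""
    heat = clausius = 0.0
    dT = (T1 - T0) / steps
    dV = (V1 - V0) / steps
    for i in range(steps):
        T = T0 + (i + 0.5) * dT                   # midpoint temperature
        V = V0 + (i + 0.5) * dV                   # midpoint volume
        dQ = 1.5 * NK * dT + NK * T * dV / V      # dQ = dE + p dV
        heat += dQ
        clausius += dQ / T
    return heat, clausius

def integrate_path(points, steps=20_000):
    """Sum the legs joining successive (T, V) points."""
    heat = clausius = 0.0
    for (T0, V0), (T1, V1) in zip(points, points[1:]):
        q, c = integrate_leg(T0, V0, T1, V1, steps)
        heat += q
        clausius += c
    return heat, clausius

start, end = (1.0, 1.0), (2.0, 3.0)
heat_a, clausius_a = integrate_path([start, (2.0, 1.0), end])  # heat, then expand
heat_b, clausius_b = integrate_path([start, (1.0, 3.0), end])  # expand, then heat
expected = 1.5 * NK * math.log(2.0) + NK * math.log(3.0)       # from (2.22)
print(heat_a, heat_b, clausius_a, clausius_b, expected)
```

The two paths absorb different amounts of heat, yet both Clausius integrals agree with the entropy difference computed from (2.22).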

Finally, we use (2.17) to note that $pV^{5/3} = \text{constant}$ describing the quasistatic adiabatic
compression of the monatomic ideal gas is nothing more than the condition $S = \text{constant}$.
Quasistatic adiabatic compression of a system, characterised of course by $dQ = 0$ at
every incremental stage of the process, is isentropic (constant $S$) by analogy with an
isothermal compression at constant $T$.

We have established the entropy of an ideal gas explicitly, but what are those special 
properties referred to earlier? Now it is time to find out. 


2.6 Entropy Change in Free Expansion of an Ideal Gas 

The core message of the second law of thermodynamics is that the entropy of an isolated 
system cannot decrease in spontaneous thermodynamic processes: those brought about, 
usually, by the release of a constraint. In almost all circumstances it increases, but in 
some special cases it can remain the same. Let us illustrate this with examples. 

We first consider what is called a free, or Joule, expansion of an ideal gas. Initially
the gas is held inside a container of volume $V_0$ at pressure $p_0$ and temperature $T_0$. The
container is situated inside a larger volume $V_1$ that is otherwise empty and thermally
isolated from the environment. The container bursts and an expansion into $V_1$ takes place,
involving gas flow, shock waves, pressure gradients, sound generation and so on, and the
process is nonquasistatic. The process is illustrated in Figure 2.3. When everything has
settled down into a new equilibrium state, the final entropy can be identified as

$$S(T_1,N,V_1) = Nk\ln\left(\frac{(kT_1)^{3/2}}{cN/V_1}\right), \tag{2.24}$$

while the initial entropy was

$$S(T_0,N,V_0) = Nk\ln\left(\frac{(kT_0)^{3/2}}{cN/V_0}\right). \tag{2.25}$$

Figure 2.3 A container of volume $V_0$ holds a gas in equilibrium at temperature $T_0$ and pressure
$p_0$. It then ruptures, giving rise to the free expansion of the gas into the larger, thermally isolated
volume $V_1$, and once equilibrium has been restored, the temperature and pressure are $T_1$ and $p_1$
respectively.


Since the gas does no work in expanding against a vacuum, and no heat is supplied
from the environment since the outer container is thermally isolated, there is no change
in the system energy. This implies that the final temperature is the same as the initial
temperature, because the energy is given by $E = (3/2)NkT$, and so the entropy change is

$$\Delta S = S(T_0,N,V_1) - S(T_0,N,V_0) = Nk\ln\left(\frac{V_1}{V_0}\right). \tag{2.26}$$

Note that the unknown constant c does not appear in a difference of ideal gas entropies. 
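
The cancellation of $c$ can be seen by evaluating the entropy function (2.22) directly. In the sketch below the function names are hypothetical and the two values of $c$ are arbitrary placeholders, chosen only to show that the difference of entropies does not depend on them.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def entropy(T, V, N, c):
    """S(T, V, N) = N k ln[(kT)^(3/2) / (c N / V)], as in equation (2.22)."""
    return N * K_BOLTZMANN * math.log((K_BOLTZMANN * T) ** 1.5 / (c * N / V))

def free_expansion_entropy_change(T0, V0, V1, N, c):
    """Entropy change when the gas expands freely from V0 to V1 at fixed T0."""
    return entropy(T0, V1, N, c) - entropy(T0, V0, N, c)

N, T0, V0, V1 = 1.0e22, 300.0, 1.0e-3, 2.0e-3
dS_first = free_expansion_entropy_change(T0, V0, V1, N, c=1.0)      # placeholder c
dS_second = free_expansion_entropy_change(T0, V0, V1, N, c=1.0e-40)  # another c
dS_formula = N * K_BOLTZMANN * math.log(V1 / V0)                    # equation (2.26)
print(dS_first, dS_second, dS_formula)
```

Both evaluations agree with $Nk\ln(V_1/V_0)$, which is positive whenever the gas expands.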

Clearly, the system entropy has increased as a consequence of the free expansion
since $V_1 > V_0$, which is our first example of the second law in action. But this is a new
aspect of entropy: we conclude that it does not change solely as a result of quasistatic
heat transfers, as implied by (2.23). We related the change in entropy of a system to an
incremental quasistatic heat input by

$$dS = \left(\frac{dQ}{T}\right)_q, \tag{2.27}$$

but this has to be modified for a nonquasistatic process such as free expansion because
in such cases there is entropy change but no heat transfer. We consider instead the
expression

$$dS = \frac{dQ}{T_r} + dS_i, \tag{2.28}$$

where $T_r$ is the temperature of the heat source, and the corresponding version for a finite
change, $\Delta S = \int dQ/T_r + \Delta S_i$.

As suggested in Figure 2.1, in thermodynamics we often invoke an environment that
exchanges heat with a system, but which has an infinite heat capacity. Such environments
are often called heat baths, or reservoirs (hence suffix r). In (2.28) we consider the
transfer of heat $dQ$ to a system, delivered nonquasistatically, which changes the system
temperature but without affecting that of the reservoir.

Once the system has settled down into a new equilibrium state, with time-independent
properties, its entropy change is $dS$. However, this does not necessarily match the quantity
$dQ/T_r$ (note the absence of the label q, indicating that it is a nonquasistatic process) and
the discrepancy is denoted $dS_i$. Equation (2.28) therefore defines $dS_i$, which we shall call
the internal change in system entropy, in contrast to the first term in (2.28) that involves
heat flow from the external reservoir. In the case of free expansion, there was no heat
transfer and the change in system entropy was entirely internal; so the corresponding
expression would be $\Delta S = \Delta S_i$, such that $\Delta S_i = Nk\ln(V_1/V_0)$. The ‘natural’ direction
of change of the system brought about by the rupture of the container is accompanied
by a positive $\Delta S_i$. We need to investigate further the properties of this contribution.


2.7 Entropy Change due to Nonquasistatic Heat Transfer


Consider a process of heat exchange between a reservoir at temperature $T_r$ and a
monatomic ideal classical gas initially at temperature $T_0$. The system variables $N$ and $V$
are fixed. No work is done on the system and the heat transfer is nonquasistatic. After
equilibrium is reestablished, we assume that the system has acquired the temperature $T_r$,
according to the zeroth law, and the change in system entropy is

$$\Delta S = S(T_r,N,V) - S(T_0,N,V) = Nk\ln\left(\frac{(kT_r)^{3/2}}{cN/V}\right) - Nk\ln\left(\frac{(kT_0)^{3/2}}{cN/V}\right) = \frac{3}{2}Nk\ln\left(\frac{T_r}{T_0}\right). \tag{2.29}$$

We could be heating or cooling the system, so this entropy change could be positive or
negative, depending on whether $T_r$ is greater than or less than $T_0$. A positive or negative
change in system entropy is not in conflict with the second law as the system is not
isolated. Of more significance is the sign of

$$\Delta S_i = \Delta S - \frac{\Delta Q}{T_r} = \Delta S - \frac{\Delta E}{T_r} = \frac{3}{2}Nk\left[\ln\left(\frac{T_r}{T_0}\right) - \frac{T_r - T_0}{T_r}\right], \tag{2.30}$$

where we have employed (2.29), the first law $\Delta Q = \Delta E$ describing the heat transfer and
system energy change during the nonquasistatic process, and $E = (3/2)NkT$. It is crucial
to notice that the contribution $\Delta S_i$ to the system entropy change is positive whether $T_0$
is greater than or less than $T_r$, for both cooling and heating, as illustrated in Figure 2.4.

$\Delta S_i$ is the internal change in entropy and is a consequence of the nonquasistatic nature
of the heat exchange between the environment and the ideal gas. It is never negative for
this process. It is also referred to as the dissipative contribution to entropy change, or
simply as entropy production. It is the central player in the second law of thermodynamics:
it is the source that causes the total entropy to increase in a spontaneous nonquasistatic
process. The entropy of an ideal gas can be increased or decreased, for example, by
heating or cooling, but the internal entropy change seems never to be negative.
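
The sign of the internal entropy change can be checked for both heating and cooling. The sketch below simply evaluates $\Delta S - \Delta Q/T_r$ for an ideal gas brought to the reservoir temperature at fixed $N$ and $V$; the function name, the default $Nk = 1$ and the sample temperatures are illustrative assumptions.

```python
import math

def internal_entropy_change(T0, Tr, Nk=1.0):
    """Entropy production when a gas at T0 equilibrates with a reservoir at Tr."""
    dS_system = 1.5 * Nk * math.log(Tr / T0)   # entropy change of the gas
    dQ = 1.5 * Nk * (Tr - T0)                  # heat received, equal to the energy change
    return dS_system - dQ / Tr                 # what is left over is internal

heating = internal_entropy_change(T0=200.0, Tr=300.0)  # gas is warmed
cooling = internal_entropy_change(T0=400.0, Tr=300.0)  # gas is cooled
matched = internal_entropy_change(T0=300.0, Tr=300.0)  # no temperature mismatch
print(heating, cooling, matched)
```

The system entropy rises in the first case and falls in the second, but the internal contribution is positive in both and vanishes only when the temperatures already match.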



Figure 2.4 The dimensionless internal entropy change $\Delta S_i/[(3/2)Nk]$, plotted against $T_0/T_r$,
associated with nonquasistatic heat transfer between a reservoir at temperature $T_r$ and an ideal
gas system initially at temperature $T_0$, assuming that they eventually become isothermal.

But how are we to interpret $\Delta S_i$? This is not straightforward. The most we can say
at this point is that it is associated with heat flows and the transient departure from
equilibrium brought about by differences in temperature between the system and
environment. Only when such differences and flows become infinitesimal, and the rate of
the corresponding process quasistatic, can entropy production be eliminated. If the
temperature difference $T_r - T_0 = \delta T$ is small, we can insert the approximation $\ln(T_r/T_0) =
-\ln(1 - \delta T/T_r) \approx \delta T/T_r + (1/2)(\delta T/T_r)^2$ in (2.30) to demonstrate that the corresponding
internal entropy change is $\delta S_i \propto (\delta T)^2$: second order in the initial temperature
difference. This is enough to justify the claim that $\Delta S_i = 0$ for a quasistatic process,
during which the temperature of the environment is changed extremely slowly, allowing
only tiny temperature mismatches between the environment and system.

We might point out that heat bath temperatures are meant to remain constant; so to
be more explicit, we should represent a general quasistatic heat transfer process as a
sequence of thermal equilibrations of the system with a set of heat baths at various
slightly different temperatures, each of which brings about a small change $\delta T$ in system
temperature and a small contribution $\delta S_i$ to internal entropy change. The sum of
the $\delta T$ makes a finite overall temperature change $\Delta T = \sum\delta T$ but the overall internal
entropy change is $\Delta S_i = \sum\delta S_i \propto \sum(\delta T)^2$, and this vanishes since a sum of second
order infinitesimal contributions is negligible: for $n$ equal steps $\delta T = \Delta T/n$, the sum
scales as $n(\Delta T/n)^2 = (\Delta T)^2/n$, which goes to zero as $n$ grows. Such gentle coupling of the
system to slightly warmer or colder heat baths never produces significant heat flows and so the
process does not lead to internal entropy change.
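
The vanishing of the internal entropy change for a finely staged process can be seen numerically. The sketch below heats a gas from one temperature to another through a chain of equally spaced heat baths and adds up the entropy produced at each equilibration; the function name, the equal spacing of the baths and the choice $Nk = 1$ are assumptions made for illustration.

```python
import math

def staged_internal_entropy(T_start, T_end, n_baths, Nk=1.0):
    """Total entropy production when heating through n_baths successive reservoirs."""
    total = 0.0
    T = T_start
    step = (T_end - T_start) / n_baths
    for i in range(n_baths):
        Tr = T_start + (i + 1) * step            # temperature of the next bath
        # Entropy produced by one nonquasistatic equilibration from T to Tr
        total += 1.5 * Nk * (math.log(Tr / T) - (Tr - T) / Tr)
        T = Tr                                   # the gas now sits at the bath temperature
    return total

one_bath = staged_internal_entropy(200.0, 300.0, n_baths=1)
ten_baths = staged_internal_entropy(200.0, 300.0, n_baths=10)
many_baths = staged_internal_entropy(200.0, 300.0, n_baths=1000)
print(one_bath, ten_baths, many_baths)
```

The overall temperature rise is the same in all three cases, but the entropy produced falls roughly in proportion to the inverse of the number of baths.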

Consider now the entropy change of the reservoir that exchanges heat with the system
in Figure 2.4. The reservoir is supposed to be large enough that the exchange of heat
does not affect its temperature. It remains in equilibrium and hence suffers no internal
entropy change. This is a defining feature of a reservoir: of course it is an idealisation
of a real source of heat, but a very useful one. Its change in entropy in the heat transfer
process is simply $\Delta S_r = -\Delta Q/T_r$, the negative sign indicating that the transfer of heat
to the reservoir is equal and opposite to the transfer of heat $\Delta Q$ to the system. Since the
change in entropy of the system is $\Delta S = \Delta S_i + \Delta Q/T_r$, we see that the exchange of
heat between the reservoir and the system simply transfers entropy from one to the other.
The change in entropy of the ‘universe’, here taken to be the combination of the ideal gas
system and the reservoir, is $\Delta S_{\rm tot} = \Delta S + \Delta S_r$, and this is just $\Delta S_{\rm tot} = \Delta S_i \geq 0$. All the
entropy production associated with the process takes place in the system, though once it
is generated, it can be transported along with heat into the reservoir. The second law can
be cast as ‘changes in equilibrium state brought about by the removal of a constraint and
the subsequent spontaneous heat flow between components of an overall isolated system
will always be accompanied by an increase in the combined entropy’. Clausius put it
more succinctly in 1865: Die Entropie der Welt strebt einem Maximum zu: the entropy
of the universe evolves towards a maximum.

The claim that $\Delta S_i$ is never negative is axiomatic: it is a statement of the second
law. It turns out that it rationalises a host of empirical observations, some of which we
have already seen with the ideal gas. It is a very powerful principle for determining the
direction (or more properly the destination) of change in spontaneous thermodynamic
processes; that heat should flow to equalise temperatures, for example. We cannot prove
it from the other laws of classical thermodynamics. Next we demonstrate how it emerged
historically from empirical studies of heat transfer and engine design.


2.8 Cyclic Thermodynamic Processes, the Clausius
Inequality and Carnot’s Theorem

We consider taking the ideal gas around a cyclic heating and cooling process driven by
a time-dependent reservoir temperature $T_r(t)$, or more properly a sequence of reservoirs
with slightly different temperatures. We assume the initial and final situations are at
equilibrium but the process need not be quasistatic. Cyclic means that the conditions at
the end of the process are the same as those at the start. The system state variables are
returned to their initial values, and this includes the entropy. We integrate (2.28) to give

$$S_{\rm final} - S_{\rm initial} = \oint\frac{dQ}{T_r(t)} + \Delta S_i, \tag{2.31}$$

where the time dependence of the reservoir temperature has been made explicit, and
where the integration sign implies a cyclic process. Since $S_{\rm final} = S_{\rm initial}$ for a cycle and
$\Delta S_i \geq 0$, this means that

$$\oint\frac{dQ}{T_r(t)} \leq 0, \tag{2.32}$$

which is known as the Clausius inequality. It is a practical statement of the second law
in terms of thermal driving and heat transfers. It is an equality for quasistatic process
conditions, for which $\Delta S_i = 0$, and where we can replace the reservoir temperature by
the system temperature, and write $\oint(dQ/T)_q = 0$.

The Carnot cycle is a famous example of a quasistatic cyclic process, named after
Sadi Carnot (1796-1832). In its simplest form, it is a sequence of quasistatic expansions
and compressions of an ideal gas, as illustrated in Figure 2.5. The first stage in the cycle
is an isothermal expansion in contact with a hot reservoir. This moves the system along
a path known as an isotherm from the top left to the top right of the cycle shown. The
second stage is an adiabatic expansion from top right to bottom right, taking the system
along a path known as an adiabat or isentrope. The third is an isothermal compression
in contact with a cold reservoir, and the fourth is an adiabatic compression that returns
the system to its original state.

Figure 2.5 Sketch of the quasistatic isothermal and adiabatic expansions and compressions that
make up a Carnot cycle operating on an ideal gas.

The cycle is driven by a time-dependent environmental pressure, synchronised with the
thermal coupling and decoupling between the system and the two reservoirs. Since it is
quasistatic, there is no entropy generation during the process, and furthermore the change
in system entropy for one complete cycle should be zero. This can easily be demonstrated
using the entropy function (2.22) and the adiabatic condition $TV^{2/3} = \text{constant}$. The
entropy of the gas changes only during the isothermal expansion and compression stages
of the cycle; so the change over a cycle is

$$\Delta S = Nk\ln\left(\frac{V_2}{V_1}\right) + Nk\ln\left(\frac{V_4}{V_3}\right), \tag{2.33}$$

in the notation used in the diagram, and since $T_hV_2^{2/3} = T_cV_3^{2/3}$ and $T_hV_1^{2/3} = T_cV_4^{2/3}$,
we have $V_2/V_1 = V_3/V_4$ and $\Delta S$ vanishes.

The cycle is designed to convey heat from the hot to the cold reservoir, with the
conversion of some of this flow into mechanical work, obtained by employing the system
expansions and compressions to move a load, for example. The hot reservoir passes
heat $\Delta Q_h$ into the system during the isothermal expansion from $V_1$ to $V_2$. Since $\Delta E = 0$
for the system during this expansion, this heat is converted into work performed on the
environment, which is equal to $\int_{V_1}^{V_2}p\,dV = NkT_h\ln(V_2/V_1)$. Similarly, the cold reservoir
receives heat $\Delta Q_c$ equal to the work done on the system during the isothermal compression,
which is $-\int_{V_3}^{V_4}p\,dV = -NkT_c\ln(V_4/V_3) = NkT_c\ln(V_2/V_1)$. The work done on
the environment per cycle is $\Delta W = \Delta Q_h - \Delta Q_c = Nk(T_h - T_c)\ln(V_2/V_1)$. We have
an idealised device that converts heat into work, a type of heat engine known as a
Carnot engine. Furthermore, it can operate in both directions: the input of mechanical
work can pump heat from the cold to the hot reservoir, like a refrigerator.

The sequence can be used to investigate the efficiency $\eta_c$ of a Carnot engine. This
is the amount of work produced as a proportion of the amount of heat taken from the
hot reservoir, or $\Delta W/\Delta Q_h$. Carnot demonstrated in 1824 that this efficiency depended
solely on the temperatures of the hot and cold heat reservoirs: this is known as Carnot’s
theorem. Interestingly, he obtained these results by using what was called the caloric
theory of heat, which is now discredited. Using our present methods we can write

$$\eta_c = \frac{\Delta W}{\Delta Q_h} = 1 - \frac{T_c}{T_h}. \tag{2.34}$$


The point here is that the efficiency is always less than 100%: there is always waste
heat $\Delta Q_c$ transferred to the cold reservoir. The most efficient engines need to operate
between as great a temperature difference as possible. Carnot went on to show that any
engine based on the quasistatic expansion and compression of an arbitrary substance had
the same efficiency as the equivalent ideal gas Carnot engine or engines. In actual fact,
most heat engines in Carnot’s time were dreadfully inefficient, and the upper limit was
nowhere near attainable. But the implications of his analysis were far reaching. In the
next section, we show that a consideration of cyclic processes reveals that any real engine
operating nonquasistatically has a lower efficiency than if it operated quasistatically, and
that this reduction in efficiency is associated with the generation of entropy. For this
reason, Carnot is regarded as the father of the second law.
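
The bookkeeping for the ideal gas Carnot cycle described in this section can be collected into a short calculation. The sketch below is illustrative only: the function name, the dictionary keys, the choice $Nk = 1$ and the sample temperatures and volumes are assumptions. It uses the adiabatic condition $TV^{2/3} = \text{constant}$ to locate the corners of the cycle, then evaluates the heats, the work, the efficiency and the entropy change over a cycle.

```python
import math

def carnot_cycle(Th, Tc, V1, V2, Nk=1.0):
    """Heats, work, efficiency and entropy change for a monatomic ideal gas cycle."""
    # Adiabats: T V^(2/3) = constant, so each volume scales by (Th/Tc)^(3/2)
    V3 = V2 * (Th / Tc) ** 1.5
    V4 = V1 * (Th / Tc) ** 1.5
    Qh = Nk * Th * math.log(V2 / V1)      # heat taken from the hot reservoir
    Qc = Nk * Tc * math.log(V3 / V4)      # heat delivered to the cold reservoir
    W = Qh - Qc                           # work done on the environment per cycle
    dS = Nk * math.log(V2 / V1) + Nk * math.log(V4 / V3)   # as in (2.33)
    return {"Qh": Qh, "Qc": Qc, "W": W,
            "efficiency": W / Qh,
            "clausius": Qh / Th - Qc / Tc,
            "dS": dS}

cycle = carnot_cycle(Th=500.0, Tc=300.0, V1=1.0, V2=2.0)
print(cycle)
```

For these sample temperatures the efficiency comes out close to $1 - T_c/T_h = 0.4$ whatever volumes are chosen, while the Clausius sum and the entropy change over the cycle vanish to within rounding error.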


2.9 Generality of the Clausius Expression for Entropy Change 

First, we consider a Carnot cycle operating on a material, or working substance, for
which we do not know an equation of state, or the heat capacities, or indeed whether
an entropy state variable may be defined at all. Nevertheless, we can take it through
a cycle of compressions and expansions. Let us actually consider a reversed cycle:
an isothermal compression at $T_h$, adiabatic expansion, isothermal expansion at $T_c$ and
adiabatic compression back to the starting point. All this is done quasistatically. Such a
reverse cycle acts as a refrigerator or a heat pump. An amount of heat $\Delta Q'_c$ is taken from
the cold reservoir and $\Delta Q'_h$ is delivered to the hot reservoir, per cycle, and this requires
an input of work $\Delta W' = \Delta Q'_h - \Delta Q'_c$. The flows of energy are illustrated in Figure 2.6.

Now imagine that we use an ideal gas Carnot engine to drive a Carnot heat pump
employing the arbitrary working substance. We design the cycles for our engine and
pump to make sure that the output work per cycle of one matches the input work per
cycle of the other: $\Delta W = \Delta W'$. From Figure 2.6 we can deduce that

$$\Delta Q_h - \Delta Q_c = \Delta Q'_h - \Delta Q'_c. \tag{2.35}$$

Taken together, these two systems quasistatically take heat $\Delta Q_h - \Delta Q'_h$ from the hot
reservoir, per cycle, and deliver an equal amount $\Delta Q_c - \Delta Q'_c$ to the cold reservoir.
The operation of the machine is reversible, in the sense that we could have the ideal
gas Carnot cycle operating in a forward direction (as an engine) and the arbitrary
substance Carnot cycle operating backwards (as a heat pump), or vice versa. Now, Carnot
regarded it as inadmissible that a machine could be designed, even in principle, that could
pump heat from a cold to a hot reservoir without some external input of work. But if
$\Delta Q_h \neq \Delta Q'_h$, then in one of the directions of operation this is precisely what we have
got. The only conclusion is that $\Delta Q_h = \Delta Q'_h$ and the two cycles cancel each other out.

Figure 2.6 Two Carnot engines operating between a hot and cold reservoir, one acting forwards
on an ideal gas, and the other in reverse on an arbitrary substance.

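Now let us consider the Clausius integral for the Carnot cycle taken by the ideal gas: 

$$\oint \left(\frac{dQ}{T}\right)_q = 0 = \frac{\Delta Q_h}{T_h} - \frac{\Delta Q_c}{T_c}, \qquad (2.36)$$

and the arbitrary working substance: 

$$\oint \left(\frac{dQ}{T}\right)_q = \frac{\Delta Q'_h}{T_h} - \frac{\Delta Q'_c}{T_c} = \frac{\Delta Q_h}{T_h} - \frac{\Delta Q_c}{T_c} = 0, \qquad (2.37)$$

using (2.36). Therefore we can write 

$$\left(\frac{dQ}{T}\right)_q = dS, \qquad (2.38)$$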

for an arbitrary substance undergoing a portion of a Carnot cycle, where dS is an 
increment in a state variable S. If the arbitrary substance were an ideal gas, this would 
be the entropy quantity explored in earlier sections. Entropy is therefore a general property 
of matter defined through the Clausius integral. We deduced this in spite of not 
knowing the shape of the isotherms and adiabats on the $p$-$V$ plot for the arbitrary substance. 
The final step is to notice that any quasistatic cyclic process on the $p$-$V$ plot 
can be represented as a modified sequence of overlapping Carnot cycles, as illustrated 
in Figure 2.7; so the path taken can be quite general, allowing us to evaluate entropy 
differences between any two equilibrium states of the arbitrary substance. 
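The vanishing of the Clausius sum for the ideal gas Carnot cycle can be checked numerically. The sketch below is illustrative only: the function name, the choice $Nk = 1$ and the sample temperatures and volumes are assumptions, and it uses the fact that a quasistatic adiabat of a monatomic ideal gas keeps $TV^{2/3}$ constant, which follows from $pV^{5/3} = \text{constant}$ and $pV = NkT$.

```python
import math

def carnot_clausius_sum(T_h, T_c, V1, V2, N_k=1.0):
    """Return (dQ_h, dQ_c, dQ_h/T_h - dQ_c/T_c) for a monatomic ideal gas Carnot cycle.

    N_k is the product N*k; the default of 1 is an arbitrary illustrative choice.
    """
    # Quasistatic adiabats keep T * V**(2/3) constant, which fixes the
    # volumes V3 and V4 at the two ends of the cold isotherm.
    scale = (T_h / T_c) ** 1.5
    V3, V4 = V2 * scale, V1 * scale
    dQ_h = N_k * T_h * math.log(V2 / V1)  # heat absorbed on the hot isotherm
    dQ_c = N_k * T_c * math.log(V3 / V4)  # heat rejected on the cold isotherm
    return dQ_h, dQ_c, dQ_h / T_h - dQ_c / T_c

dQ_h, dQ_c, clausius = carnot_clausius_sum(T_h=500.0, T_c=300.0, V1=1.0, V2=3.0)
efficiency = 1.0 - dQ_c / dQ_h  # fraction of dQ_h converted to work
```

With these inputs the Clausius sum vanishes to rounding error and the efficiency equals $1 - T_c/T_h$, whatever volumes are chosen.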




Figure 2.7 Paths ABCDGHA and DCEFD are two distinct Carnot cycles comprising sections 
along isotherms and adiabats of an arbitrary substance in the $p$-$V$ plot. Path ABCEFDGHA is 
not a Carnot cycle: four heat baths are involved (there are four isotherms in the path). However, 
it is equivalent to the path ABCD(DCEFD)GHA: a Carnot cycle within another Carnot cycle. 
Clearly, any cyclic path can be broken down into elementary Carnot cycles. 

It follows that if any substance receives heat from a reservoir nonquasistatically, the heat 
contributes to the change in its entropy through 

$$dS = \frac{dQ}{T_r} + dS_i, \qquad (2.39)$$

where we include the internal entropy production. Furthermore, for a cyclic process 
we find that the Clausius inequality (2.32) emerges just as before. Nonquasistatic heat 
transfer to a system from a reservoir creates entropy, to a degree defined in (2.39). All 
substances possess entropy and the capacity to generate it when thermally processed, as 
specified by the second law, just as we found explicitly was the case for the ideal gas. 

Finally, we assess the efficiency of an ideal gas heat engine that follows the path 
shown in Figure 2.5 but this time with the formerly isothermal segments taken at a 
nonquasistatic rate. We wait at the end of each for thermal equilibrium to be restored 
before embarking on the quasistatic adiabatic segments. We again deduce that the heat 
extracted from the hot reservoir $\Delta Q_h$ is equal to the work done on the environment 
($\Delta E = 0$ still for the isothermal processing of an ideal gas), but now according to 
(2.39) this is equal to $(\Delta S_{12} - \Delta S_i^h) T_h$, where $\Delta S_{12} = S(T_h, V_2, N) - S(T_h, V_1, N) = 
Nk \ln(V_2/V_1)$ from (2.22), and $\Delta S_i^h$ is the entropy generated internally during the 
expansion of the gas in contact with the hot reservoir. Similarly, the heat transferred 
to the cold reservoir $\Delta Q_c$ during the nonquasistatic compression is equal to the work 
done on the system by the environment and is equal to $-(\Delta S_{34} - \Delta S_i^c) T_c$, where 
$\Delta S_{34} = S(T_c, V_4, N) - S(T_c, V_3, N) = Nk \ln(V_4/V_3)$ and $\Delta S_i^c$ denotes the entropy 
generated internally during the compression of the gas in contact with the cold reservoir. 
As before, the efficiency is the total work per cycle divided by the heat extracted from 
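the hot reservoir, so 

$$\eta = \frac{\Delta Q_h - \Delta Q_c}{\Delta Q_h} = 1 + \frac{(\Delta S_{34} - \Delta S_i^c)\,T_c}{(\Delta S_{12} - \Delta S_i^h)\,T_h} = 1 - \frac{T_c}{T_h}\,\frac{\Delta S_{12} + \Delta S_i^c}{\Delta S_{12} - \Delta S_i^h} \le 1 - \frac{T_c}{T_h}, \qquad (2.40)$$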


from (2.34), and we see that the entropy production causes the efficiency of the engine 
to fall below the Carnot efficiency. Turning this around, the failure to extract as much 
work from a heat flow as might theoretically be possible, or indeed any inefficiency 
of energy transformation, is due to entropy generation, giving the latter an additional 
intuitive meaning as a measure of the wastage of heat in the operation of heat engines. 
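The effect of internal entropy generation on the efficiency can be made concrete with a short calculation. This is a minimal sketch under stated assumptions: the function and argument names are hypothetical, entropies are in units where $Nk = 1$, and it uses $\Delta S_{34} = -\Delta S_{12}$, which holds because the quasistatic adiabats fix $V_4/V_3 = V_1/V_2$.

```python
import math

def engine_efficiency(T_h, T_c, dS_12, dS_i_hot=0.0, dS_i_cold=0.0):
    """Efficiency of the cycle when the isothermal legs generate entropy internally.

    dS_12     : entropy change of the gas on the hot isotherm, N k ln(V2/V1)
    dS_i_hot  : entropy generated internally during the hot expansion
    dS_i_cold : entropy generated internally during the cold compression
    """
    dQ_h = (dS_12 - dS_i_hot) * T_h   # heat extracted from the hot reservoir
    dQ_c = (dS_12 + dS_i_cold) * T_c  # heat delivered to the cold reservoir
    return (dQ_h - dQ_c) / dQ_h

dS_12 = math.log(3.0)  # volume tripled on the hot isotherm (illustrative)
carnot = engine_efficiency(500.0, 300.0, dS_12)
lossy = engine_efficiency(500.0, 300.0, dS_12, dS_i_hot=0.1, dS_i_cold=0.1)
```

Setting both internal contributions to zero recovers the Carnot value $1 - T_c/T_h$; any positive internal generation on either leg lowers the result.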


2.10 Entropy Change due to Nonquasistatic Work 

Our next example of entropy production concerns the nonquasistatic performance of 
work on an ideal gas without any heat transfer. Since $\Delta Q$ is zero, we expect to find that 
the overall change in entropy is entirely the $\Delta S_i$ arising from internal generation, as was 
the case with free expansion. Consider a monatomic ideal classical gas in a thermally 
insulated cylinder with a mismatch between its pressure and that of the environment. 
What happens if we release the piston? 

Equation (2.9) tells us that $pV^{5/3} = \text{constant}$ characterises a quasistatic expansion or 
compression of the gas in the absence of heat exchange. This means that following the 
release of the piston at a point where the system has volume $V_0$ and pressure $p_0$, the 
state of the system will oscillate on a $p$-$V$ diagram about the external pressure $p_r$, along 
an adiabat, as long as we assume the motion is quasistatic. If so, it would behave in the 
same way as an undamped spring and the system entropy would remain constant. 

However, the motion is not quasistatic. Intuitively we know that a real gas would not 
oscillate for ever: eventually the motion will cease when the bulk kinetic energy of the 
oscillation is converted into heat. We view this as a consequence of the finite viscosity 
of the gas. The final pressure when the piston comes to rest would be expected to be 
equal to the reservoir pressure $p_r$. Since the cylinder is insulated, the heat generated by 
viscosity is not passed to the environment and the final equilibrium temperature would 
therefore be greater than the temperature of the system when at pressure $p_r$ on the 
adiabat during undamped oscillation. The raised temperature at the equilibrium pressure 
$p_r$, together with the equation of state $pV = NkT$, means that the final volume $V_f$ of the 
gas will lie to the right of the adiabat. Hence the entropy of the final equilibrium state, 
proportional to $\ln(p_r V_f^{5/3})$, is greater than the initial entropy, proportional to $\ln(p_0 V_0^{5/3})$. 
The development on a $p$-$V$ plot corresponding to this history is illustrated in Figure 
2.8. Entropy has been generated during the spontaneous equilibration in pressure between 
system and environment. 
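A rough numerical version of this argument is sketched below. It rests on an assumption that is not spelled out in the text: the environment stays at the fixed pressure $p_r$ throughout, so the net work done by the gas is $p_r(V_f - V_0)$, and all bulk kinetic energy ends up as heat in the gas. With $E = (3/2)pV$ for the monatomic ideal gas, the energy balance then fixes $V_f$. The function name is hypothetical.

```python
import math

def piston_final_state(p0, V0, p_r):
    """Final volume, adiabat volume at p_r, and entropy change in units of N k.

    Assumes a thermally insulated monatomic ideal gas and an environment held
    at constant pressure p_r (an illustrative assumption, see the lead-in).
    """
    # Energy balance: (3/2)(p_r V_f - p0 V0) = -p_r (V_f - V0)
    V_f = V0 * (3.0 * p0 + 2.0 * p_r) / (5.0 * p_r)
    # Volume at pressure p_r on the adiabat through the initial state
    V_adiabat = V0 * (p0 / p_r) ** 0.6
    # S = (3/2) N k ln(p V**(5/3)) + constant for the monatomic ideal gas
    dS_over_Nk = 1.5 * math.log(p_r * V_f ** (5.0 / 3.0) / (p0 * V0 ** (5.0 / 3.0)))
    return V_f, V_adiabat, dS_over_Nk

expansion = piston_final_state(p0=2.0, V0=1.0, p_r=1.0)    # gas initially overpressured
compression = piston_final_state(p0=0.5, V0=1.0, p_r=1.0)  # gas initially underpressured
```

In both cases the final volume lies to the right of the adiabat and the entropy change is positive; it vanishes only when $p_0 = p_r$.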

We introduced viscous damping as a familiar process whereby motion in a mechanical 
system is naturally brought to a halt. It is often known as dissipation: the kinetic energy 
is dissipated as heat. The production of entropy that we have just deduced is somehow 
connected with the conversion of energy from what mechanical engineers would regard 
as a high quality coherent form (piston motion) into a low quality incoherent form 
(motion of the atoms). Since the system receives no heat transfer from the environment, 
the entropy increase is from internal generation $\Delta S_i$ alone. It is clearly appropriate 
to describe internal entropy production as a dissipative entropy change. In addition, 
the phenomenon of viscosity, or of friction in general, is related to the generation of 
entropy associated with flows driven by pressure gradients: another intuitive link, this 
time between entropy production and the wastage of mechanical energy. 

Figure 2.8 Illustration on a $p$-$V$ plot of the increase in entropy of a monatomic ideal classical 
gas associated with the removal of an initial mismatch between system pressure $p$ and reservoir 
pressure $p_r$. The kinetic energy possessed by the oscillation (that would persist for a truly ideal 
gas) is eventually converted to heat, and hence to a volume increase at the final pressure $p_r$, if 
the gas is viscous. The final state therefore lies above the adiabat passing through the initial state. 
The final value of $pV^{5/3}$ is greater than the initial value, and hence the entropy of the system has 
increased. 


2.11 Fundamental Relation of Thermodynamics 


It is useful at this point to establish an important general connection between the entropy 
of a system and various other state variables. The result is often called the fundamental 
relation of thermodynamics. We have seen that the entropy change can be related to 
a quasistatic heat transfer to a system: $dS = (dQ/T)_q$, and that the work done in a 
quasistatic mechanical process of volume change may be written $(dW)_q = -p\,dV$. The 
argument we gave for the latter holds for any substance, not just an ideal gas, although 
there might be additional terms. Therefore, for a quasistatic process of heat transfer and 
volume change the increment in energy, according to the first law, can be written as 


$$dE = T\,dS - p\,dV. \qquad (2.41)$$


If the energy change $dE$ were brought about by a nonquasistatic process, then $dQ$ for the 
process would not be equal to $T\,dS$ and $dW$ would not be equal to $-p\,dV$. Nevertheless, 
once equilibrium has been restored, (2.41) would still specify $dE$ in terms of the incremental 
changes in equilibrium state variables $dS$ and $dV$. It has to, because changes in 
the system variables can be brought about by a variety of thermodynamic processes, 
heating and squeezing in different sequences and at different rates, but once equilibrium 
is restored, the details of the process are irrelevant as far as changes in $E$, $T$, $S$, $p$ 
and V are concerned. Therefore, if (2.41) is true for a quasistatic process, it is true for 
all processes. 

The fundamental relation provides far reaching connections between state variables. 
It is often written as 

$$dS = \frac{1}{T}\,dE + \frac{p}{T}\,dV, \qquad (2.42)$$

and since we have an expression (2.21) for S(E, V , N) for the monatomic ideal classical 
gas, this relationship can easily be checked. Since 


$$S(E, V, N) = Nk \ln\left(\frac{c\,V E^{3/2}}{N^{5/2}}\right), \qquad (2.43)$$

we have $dS = (3/2)Nk\,dE/E + Nk\,dV/V$ plus a term proportional to $dN$ that we shall 
return to shortly: as $E = (3/2)NkT$ and $pV = NkT$ this clearly matches (2.42). 

Quite generally, we know we can write an entropy increment, when considered as a 
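function of $E$ and $V$, in the form 

$$dS = \left(\frac{\partial S}{\partial E}\right)_{V,N} dE + \left(\frac{\partial S}{\partial V}\right)_{E,N} dV, \qquad (2.44)$$

which then implies that 

$$\left(\frac{\partial S}{\partial E}\right)_{V,N} = \frac{1}{T} \qquad (2.45)$$

and 

$$\left(\frac{\partial S}{\partial V}\right)_{E,N} = \frac{p}{T}. \qquad (2.46)$$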


We just showed that these are satisfied by the entropy function for the ideal gas, but 
they are quite general relationships between state variables. These results provide us 
with a means to define temperature and pressure if we happen to have at our disposal a 
mathematical expression for the entropy. 
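As an illustration of using an entropy function to define temperature and pressure, the sketch below differentiates an ideal gas entropy numerically. It assumes the form $S = Nk\ln(cVE^{3/2}/N^{5/2})$, which is consistent with the increments quoted above; the constant `c` is set to 1 arbitrarily (it does not affect derivatives with respect to $E$ or $V$), and the particle number, volume and temperature are illustrative values.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def entropy(E, V, N, c=1.0):
    """Assumed ideal gas entropy N k ln(c V E**1.5 / N**2.5); c is a placeholder constant."""
    return N * K_B * math.log(c * V * E ** 1.5 / N ** 2.5)

def partial(f, x, rel_step=1e-6):
    """Central finite-difference estimate of df/dx at x."""
    h = x * rel_step
    return (f(x + h) - f(x - h)) / (2.0 * h)

N, V, T_true = 1.0e22, 1.0e-3, 300.0  # illustrative state
E = 1.5 * N * K_B * T_true            # E = (3/2) N k T

# 1/T = (dS/dE) at fixed V, N and p/T = (dS/dV) at fixed E, N
T_from_S = 1.0 / partial(lambda e: entropy(e, V, N), E)
p_from_S = T_from_S * partial(lambda v: entropy(E, v, N), V)
p_ideal = N * K_B * T_true / V
```

The temperature recovered from the derivative matches the value used to set $E$, and the pressure matches $NkT/V$, to within the finite-difference error.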

But we neglected to explore the change in entropy brought about by a change in 
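number of particles $N$. We expect to be able to write 

$$dS = \left(\frac{\partial S}{\partial E}\right)_{V,N} dE + \left(\frac{\partial S}{\partial V}\right)_{E,N} dV + \left(\frac{\partial S}{\partial N}\right)_{E,V} dN, \qquad (2.47)$$

and to this end we define the so-called chemical potential $\mu$: 

$$\mu = -T \left(\frac{\partial S}{\partial N}\right)_{E,V}, \qquad (2.48)$$

such that the fundamental relation may be extended to 

$$dE = T\,dS - p\,dV + \mu\,dN. \qquad (2.49)$$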


Chemical potential might be unfamiliar, but it is just another state variable, an equilibrium 
property of matter. It turns out to have a rather special role to play, but first, to develop 
some intuition, let us derive an expression for the chemical potential of the ideal gas. 
We have 

$$\left(\frac{\partial S}{\partial N}\right)_{E,V} = k \ln\left(\frac{c\,V E^{3/2}}{N^{5/2}}\right) - \frac{5}{2}k, \qquad (2.50)$$

so 

$$\mu = kT \ln\left(\frac{N^{5/2} e^{5/2}}{c\,V E^{3/2}}\right) = kT \ln\left(\frac{N}{V}\,\frac{e^{5/2}}{c\,(3kT/2)^{3/2}}\right), \qquad (2.51)$$

where $e$ is the base of natural logarithms. We can make this expression more compact 
by introducing a temperature dependent length $\lambda_{th}(T)$ defined by 

$$\lambda_{th}^3 = \frac{e^{5/2}}{c\,(3kT/2)^{3/2}}, \qquad (2.52)$$

such that we get, for an ideal gas, 

$$\mu(N, V, T) = kT \ln\left(\frac{N \lambda_{th}^3}{V}\right), \qquad (2.53)$$

where we explicitly note that the chemical potential is a function of state variables $N$, $V$ 
and $T$. The quantity $\lambda_{th}$ is known as the thermal de Broglie wavelength, for reasons that 
will become apparent in Chapter 9. 

At constant particle density $N/V$, the temperature dependence of the chemical potential 
of the gas is given by $(\partial\mu/\partial T)_{N,V} = \mu/T - 3k/2$. In Chapter 9, we shall find that 
a criterion for classical, as opposed to quantum, behaviour is that the density of the 
gas should be much less than $\lambda_{th}^{-3}$, and from (2.53) this clearly means that $\mu/T < 0$. 
Therefore, the chemical potential of the classical ideal gas decreases with temperature 
at constant density. On the other hand, the chemical potential in (2.53) clearly increases 
with density at constant temperature. This dependence on density and temperature is 
sketched in Figure 2.9. Notice that we cannot determine the absolute value of the chemical 
potential as we do not (yet) know the value of the constant $c$. Also note that we do 
not sketch the chemical potential for temperatures approaching absolute zero since it is 
there that we might expect quantum behaviour. 

Notice that the entropy of the ideal gas may be written in the form 

$$S = Nk\left[\frac{5}{2} - \ln\left(n \lambda_{th}^3\right)\right], \qquad (2.54)$$

where $n = N/V$ is the particle density. The dependence of the chemical potential on 
temperature and gas density ties in with that of the entropy: as the density increases 
at constant temperature, for example by isothermal compression, so does its chemical 
potential, while its entropy decreases. 
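These trends can be explored numerically. The sketch below assumes the standard expression $\lambda_{th} = h/\sqrt{2\pi m k T}$ for the thermal de Broglie wavelength, which goes beyond what this section establishes (the text defers it to Chapter 9). The choice of argon at 300 K and atmospheric pressure, and the function names, are illustrative.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J s

def thermal_wavelength(T, m):
    """Assumed standard form h / sqrt(2 pi m k T), not derived in this section."""
    return H / math.sqrt(2.0 * math.pi * m * K_B * T)

def chemical_potential(n, T, m):
    """mu = k T ln(n lambda_th**3) for a classical ideal gas of density n."""
    return K_B * T * math.log(n * thermal_wavelength(T, m) ** 3)

def entropy_per_particle(n, T, m):
    """S/N = k [5/2 - ln(n lambda_th**3)]."""
    return K_B * (2.5 - math.log(n * thermal_wavelength(T, m) ** 3))

m_argon = 39.948 * 1.66053907e-27  # kg, illustrative choice of gas
T = 300.0
n = 101325.0 / (K_B * T)           # number density at atmospheric pressure

mu = chemical_potential(n, T, m_argon)
dT = 1e-3
dmu_dT = (chemical_potential(n, T + dT, m_argon)
          - chemical_potential(n, T - dT, m_argon)) / (2.0 * dT)
```

For these conditions $n\lambda_{th}^3$ is far below unity, so $\mu$ is negative, and the finite-difference slope agrees with $\mu/T - 3k/2$. Doubling `n` at fixed `T` raises the chemical potential and lowers the entropy per particle.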

Now we enlarge on the special role that chemical potential plays in thermodynamics. 
It is an indicator of equilibrium between isothermal systems that are able to exchange 
particles. In this respect it is analogous to the temperature, which according to the zeroth 
law is an indicator of equilibrium between systems able to exchange energy in the form 
of heat. Systems that can exchange heat and particles evolve so as to equalise their 
temperatures and chemical potentials. This behaviour is a consequence of the second 
law, as will be demonstrated in Section 3.2. We have already employed the assertion of 
the equalisation of temperatures after heat exchange in Section 2.7, without any special 
comment because it is so familiar. Particle exchange is less so, but it does make intuitive 
sense to claim that a flow of particles is driven by a difference in densities, and that the 
flow stops when density differences are eliminated. An example would be the diffusion 
of a trace gas into a large volume from a small canister once it is opened. Another 
example is the osmotic exchange of a solvent across a semipermeable membrane separating 
solutions with different solute concentrations. In view of the relationship between 
chemical potential and particle density in (2.53) for an ideal gas, differences in density 
are synonymous with differences in $\mu$, if the systems in question are isothermal, and the 
elimination of density differences is equivalent to the equalisation of chemical potentials. 
Taking the analogy with heat flow further, we might suspect that the nonquasistatic 
flow of particles between systems driven by a chemical potential difference will lead to 
entropy production. Let us examine this more closely. 

Figure 2.9 Sketch of the chemical potential of a classical ideal gas as a function of temperature 
and particle density. Note that it is a negative quantity in the classical regime. 


2.12 Entropy Change due to Nonquasistatic Particle Transfer 


In the following example we show that nonquasistatic particle flows generate entropy. 
We consider a system able to exchange particles and energy with an environment, or 
reservoir, with chemical potential $\mu_r$ and temperature $T_r$. Such a reservoir can be called 
a particle bath as well as a heat bath. It is supposed to be so large that its chemical 
potential is not changed if it supplies particles to a system, just as a heat bath can supply 
heat without changing its temperature. Neither will it suffer any internal generation of 
entropy during a process. We suppose that the gas remains isothermal with the reservoir 
during exchange, namely that its temperature $T$ is equal to $T_r$ throughout, and that the 
system volume is fixed. 

As a preliminary, we recast the extended fundamental relation as 

$$dS = \frac{1}{T}\,dE + \frac{p}{T}\,dV - \frac{\mu}{T}\,dN, \qquad (2.55)$$


but now it is our intention to identify the dependence of the entropy change on an 
increment in the temperature instead of energy, in addition to increments in volume and 
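particle number. We insert 

$$dE = \left(\frac{\partial E}{\partial T}\right)_{V,N} dT + \left(\frac{\partial E}{\partial V}\right)_{T,N} dV + \left(\frac{\partial E}{\partial N}\right)_{T,V} dN \qquad (2.56)$$

and obtain 

$$dS = \frac{C_V}{T}\,dT + \frac{1}{T}\left[p + \left(\frac{\partial E}{\partial V}\right)_{T,N}\right] dV - \frac{1}{T}\left[\mu - \left(\frac{\partial E}{\partial N}\right)_{T,V}\right] dN, \qquad (2.57)$$

using the definition of constant volume heat capacity $C_V$ from (2.11). For a system 
consisting of a monatomic ideal classical gas, receiving particles from a reservoir while 
its temperature and volume are held constant, this reduces to 

$$dS = -\frac{1}{T}\left(\mu - \frac{3}{2}kT\right) dN. \qquad (2.58)$$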


An integral of such increments would be an analogue of the Clausius definition of 
entropy change (2.23), but this time corresponding to a quasistatic particle transfer, 
driven by a time-dependent chemical potential $\mu_r(t)$ of the reservoir, with the rate of 
change slow enough that the chemical potential of the system $\mu$ remains equal to $\mu_r$ 
throughout. 

If we wish to consider nonquasistatic particle exchange between system and reservoir, 
(2.58) will need to be revised along the lines of (2.28). We write 

$$dS = -\frac{1}{T_r}\left(\mu_r - \frac{3}{2}kT_r\right) dN + dS_i \qquad (2.59)$$

as a definition of the internal entropy change $dS_i$ associated with a nonquasistatic process 
of isothermal particle transfer to a system from a particle bath, with the system starting 
and ending in equilibrium. We use the reservoir chemical potential as a reference point 
in the construction of (2.59), just as we used $T_r$ in the construction of (2.28). 

We consider a heat and particle bath at fixed temperature $T_r$ and chemical potential 
$\mu_r$. The ideal gas system has an initial temperature equal to $T_r$ and an initial chemical 
potential $\mu_0 \neq \mu_r$ that depends on temperature and the initial particle density $N_0/V$, 
according to (2.53). The system is coupled to the heat and particle bath and we assert 
that after a while a new equilibrium is established with the system chemical potential 
becoming equal to $\mu_r$. This is illustrated in Figure 2.10. The change in system entropy 
is then an extension to (2.59): 


$$\Delta S = -\frac{1}{T_r}\left(\mu_r - \frac{3}{2}kT_r\right) \Delta N + \Delta S_i, \qquad (2.60)$$


where $\Delta N = N_1 - N_0$ is the change in number of particles in the system and $\Delta S_i$ is the 
entropy generated internally. From (2.54) the change in system entropy is 

$$\Delta S = N_1 k\left[\frac{5}{2} - \ln\left(\frac{N_1 \lambda_{th}^3}{V}\right)\right] - N_0 k\left[\frac{5}{2} - \ln\left(\frac{N_0 \lambda_{th}^3}{V}\right)\right], \qquad (2.61)$$

and we rearrange (2.60) and use (2.53) to get 

$$\Delta S_i = N_0 k\left[\frac{N_1}{N_0} - 1 - \ln\left(\frac{N_1}{N_0}\right)\right]. \qquad (2.62)$$

Figure 2.10 Entropy generation $\Delta S_i$ brought about by nonquasistatic particle exchange between 
an ideal gas system and a particle bath. 


The entropy of the reservoir changes by $\Delta S_r = -[\mu_r - (3/2)kT_r]\Delta N_r/T_r$ arising from 
the change $\Delta N_r = -\Delta N$ in its number of particles, and this is equal and opposite to the 
first term in (2.60) for the system entropy change $\Delta S$. By definition, the reservoir does not 
suffer any internal generation of entropy. The change in the combined entropy of system 
and environment is therefore $\Delta S_i$, and from (2.62), which is sketched as a function of 
$N_1/N_0$ in Figure 2.10, we conclude that irrespective of whether particles are flowing into 
or out of the ideal gas system, the internal entropy change $\Delta S_i$ is never negative, in line 
with our consideration of nonquasistatic heat flow in Section 2.7 and Figure 2.4. The 
behaviour of $\Delta S_i$ as a function of initial mismatch in relevant properties is very similar 
in the two cases. 

Once again, if the mismatch in chemical potential is small, such that the change in 
particle content of the system during the process $\delta N = N_1 - N_0$ is small, then expansion 
of the logarithm in (2.62) allows us to show that $\delta S_i \propto (\delta N)^2$, and for a sequence of 
equilibrations with a set of particle baths each with a slightly different chemical potential, 
and each leading to a particle transfer $\delta N$, the overall entropy change is $\Delta S_i \sim \sum (\delta N)^2$, 

a sum of second order small quantities. In the limit of quasistatic particle exchange, the 
overall entropy production goes to zero. 
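The approach to the quasistatic limit can be illustrated numerically. The sketch below assumes that the internal entropy generated in one isothermal equilibration at fixed volume takes the form $\Delta S_i = N_0 k\,[N_1/N_0 - 1 - \ln(N_1/N_0)]$, which follows from (2.60) together with (2.53) and (2.54). The function names and particle numbers are illustrative, and entropies are returned in units of $k$.

```python
import math

def internal_entropy_particles(N0, N1):
    """Delta S_i / k for a single equilibration taking the system from N0 to N1 particles."""
    r = N1 / N0
    return N0 * (r - 1.0 - math.log(r))

def staged_transfer(N0, N1, steps):
    """Total Delta S_i / k when N0 -> N1 proceeds through `steps` equal increments,
    each one a separate equilibration with its own particle bath."""
    total = 0.0
    N = N0
    dN = (N1 - N0) / steps
    for _ in range(steps):
        total += internal_entropy_particles(N, N + dN)
        N += dN
    return total

single = staged_transfer(100.0, 200.0, 1)
ten = staged_transfer(100.0, 200.0, 10)
thousand = staged_transfer(100.0, 200.0, 1000)
```

The total falls roughly in proportion to the size of each increment, so it tends to zero as the number of intermediate baths grows, while each individual contribution remains non-negative whether particles flow in or out.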

2.13 Entropy Change due to Nonquasistatic Volume Exchange 

Finally, we consider entropy generation associated with the equalisation of pressure 
between a system and its environment. We looked at a similar situation in Section 2.10. 
For simplicity, we allow this to take place under isothermal conditions $T = T_r$, and 
without the exchange of particles. The appropriate term in (2.57) for the change in 
system entropy is 

$$dS = \frac{1}{T}\left[p + \left(\frac{\partial E}{\partial V}\right)_{T,N}\right] dV, \qquad (2.63)$$

and for an ideal gas $E = (3/2)NkT$, so $(\partial E/\partial V)_{T,N} = 0$. For a quasistatic process, the 
volume changes so slowly that the pressure adjusts to the time-dependent driving pressure 
$p_r(t)$ of the environment; so $\Delta S = \int (p/T)\,dV$ is the analogue of the Clausius integral 
for this situation. For nonquasistatic driving, we define an internal entropy production 
through 


$$dS = \frac{p_r}{T_r}\,dV + dS_i, \qquad (2.64)$$

for an incremental process, and $\Delta S = (p_r/T_r)\Delta V + \Delta S_i$ for a finite change. Bearing in 
mind that $T$ and $N$ are constant, we can use (2.22) to show that $\Delta S = Nk \ln(V_1/V_0)$ for 
a nonquasistatic change in gas volume from $V_0$ to $V_1$, so 

$$\Delta S_i = Nk \ln\left(\frac{V_1}{V_0}\right) - \frac{p_r}{T_r}(V_1 - V_0) = Nk\left[\ln\left(\frac{V_1}{V_0}\right) - \frac{V_1 - V_0}{V_1}\right], \qquad (2.65)$$

since the final system pressure $p_r$ is equal to $NkT_r/V_1$. This constitutes the total entropy 
production of ideal gas and environment for the conditions of the nonquasistatic process, 
and as it takes a form similar to (2.30) and (2.62) it is never negative, irrespective of 
the ratio of volumes $V_0/V_1$, as we have now come to expect. 
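A quick numerical check of the sign of this result is sketched below, using the form $\Delta S_i/(Nk) = \ln(V_1/V_0) - (V_1 - V_0)/V_1$ for isothermal equilibration with a reservoir at pressure $p_r = NkT_r/V_1$. The function name and the sample volume ratios are illustrative.

```python
import math

def internal_entropy_volume(V0, V1):
    """Delta S_i / (N k) for an isothermal nonquasistatic volume change from V0 to V1."""
    return math.log(V1 / V0) - (V1 - V0) / V1

# Sample both compressions (ratio < 1) and expansions (ratio > 1).
ratios = [0.1, 0.5, 0.9, 1.0, 1.1, 2.0, 10.0]
produced = [internal_entropy_volume(1.0, r) for r in ratios]
```

Every entry is non-negative, and the only zero occurs at $V_1 = V_0$, where system and environment pressures already match.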


2.14 General Thermodynamic Driving 

We have seen that thermodynamic processes driven by separate mismatches in 
temperature, pressure and chemical potential between an ideal gas and its environment, 
leading to outcomes that correspond to our intuitive expectations, are accompanied by 
an increase in the total entropy. For a thermodynamic process that involves incremental 
changes in all three driving parameters, the dissipative or internal contribution $dS_i$ to 
the change in the entropy of the system may be written as 

$$dS_i = dS - \frac{dE}{T_r} - \frac{p_r}{T_r}\,dV + \frac{\mu_r}{T_r}\,dN, \qquad (2.66)$$

where $T_r$, $p_r$ and $\mu_r$ are properties of the environment, and might be functions of time. 
Notice that we have employed the expression involving the increment $dE$ rather than 
the form involving $dT$ in (2.56), for convenience of notation. We assert that $dS_i$ in 
(2.66) is never negative, not only for an ideal gas, but for any choice of thermodynamic 
system and environment. This is a statement of the second law. 

This can be made apparent in the following way. An environment characterised by 
time-dependent parameters $T_r$, $p_r$ and $\mu_r$ drives a system such that its state variables $E$, 
$V$ and $N$ also acquire a time dependence. If the system is only mildly perturbed from 
equilibrium during such a nonquasistatic driving process, it is reasonable to characterise 
it using spatially uniform time-dependent state variables $T$, $p$ and $\mu$ as well. Intuition 
suggests that the latter typically lag behind the environmental variables that drive them, 
as sketched in Figure 2.11. We shall return to this viewpoint in Section 15.1. The system 
would also be characterised by a time-dependent entropy $S(t)$ related in the standard way 
to the time-dependent state variables. Combining (2.66) with (2.49) we can write 

$$\frac{dS_i}{dt} = \left(\frac{1}{T(t)} - \frac{1}{T_r(t)}\right)\frac{dE}{dt} + \left(\frac{p(t)}{T(t)} - \frac{p_r(t)}{T_r(t)}\right)\frac{dV}{dt} - \left(\frac{\mu(t)}{T(t)} - \frac{\mu_r(t)}{T_r(t)}\right)\frac{dN}{dt}, \qquad (2.67)$$

and the second law asserts that the sum of the three terms on the right hand side will 
not be negative. Notice that the contributions take the form of a mismatch in intensive state 
variables of system and environment, such as $1/T$, multiplied by an increment in an 
extensive state variable of the system, such as $dE$. 

Intuitively this representation makes sense. For example, if $T < T_r$ we know from 
experience that heat flows into the system, in which case $dE > 0$ (for constant $N$ and $V$) 
and the first term on the right hand side of the above equation is positive. Conversely, if 
$T > T_r$ at some point during the process, we know that heat will flow out of the system, 
such that $dE < 0$ and the term is still positive. Similarly if the system pressure (over 
$T$) is greater than/less than the reservoir pressure (over $T_r$), we know from experience 
that the natural change $dV$ in system volume is positive/negative. Thus positivity of 
the second contribution to the right hand side of (2.67) seems to emerge. Finally, if the 
system chemical potential (over $T$) is less than or greater than the ratio $\mu_r/T_r$, which 
suggests a difference in particle density between system and reservoir, we expect $dN$ to 
be positive/negative. 

Figure 2.11 Sketch of the way intensive system properties are driven by, but lag behind, 
time-dependent reservoir properties, as a consequence of nonquasistatic energy, volume and particle 
exchanges. Such a picture holds only if the rates of change are not too fast, such that thermodynamic 
properties such as temperature are approximately valid out of equilibrium. 

The second law makes the claim that in all empirical situations the sum of the 
contributions to $dS_i$ is never negative. A powerful rephrasing is that energy, volume 
and particles flow as they do, when constraints are changed, because the total entropy 
must increase. The second law hence seems to offer a rationale for the natural evolution 
of any thermodynamic process. 
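The sign arguments for the three contributions to (2.67) can be spelled out in a few lines of code. This is only a sketch: the function name and the numerical values are illustrative, the units are arbitrary, and the directions of the flows are imposed by hand to match everyday experience rather than derived from any dynamics.

```python
def entropy_production_rate(T, T_r, p, p_r, mu, mu_r, dE_dt, dV_dt, dN_dt):
    """Right hand side of (2.67): mismatch in an intensive quantity times
    the rate of change of the conjugate extensive variable of the system."""
    heat_term = (1.0 / T - 1.0 / T_r) * dE_dt
    volume_term = (p / T - p_r / T_r) * dV_dt
    particle_term = -(mu / T - mu_r / T_r) * dN_dt
    return heat_term + volume_term + particle_term

# Colder system gains energy; hotter system loses it.
cold_system = entropy_production_rate(290.0, 300.0, 1.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.0)
hot_system = entropy_production_rate(310.0, 300.0, 1.0, 1.0, 0.0, 0.0, -2.0, 0.0, 0.0)
# Higher-pressure system expands against its environment.
expanding = entropy_production_rate(300.0, 300.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.1, 0.0)
# System with the lower chemical potential gains particles.
filling = entropy_production_rate(300.0, 300.0, 1.0, 1.0, -2.0, -1.0, 0.0, 0.0, 5.0)
```

In each of the four cases the natural direction of flow gives a positive rate; reversing the sign of any of the imposed flows would make the corresponding term negative, which is what the second law excludes.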


2.15 Reversible and Irreversible Processes 

We continue to consider the evolution of a system driven by a time-dependent 
environment. The only way for the total entropy not to increase is for the system 
variables to match the driving environmental variables exactly. There should be no 
time lag between them. This intuitively requires an extremely slow rate of change: 
a quasistatic process. The terms on the right hand side in (2.67) then turn out to be 
second order in the small mismatch between the system and its environment, as we saw 
earlier, and therefore negligible even when summed over the duration of the process. 
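A toy simulation shows the entropy produced shrinking as the driving is slowed. Everything about the model is an assumption made for illustration: volume and particle number are fixed, the system has a constant heat capacity so that $E = CT$, heat flows linearly with the temperature mismatch, $dE/dt = \kappa (T_r - T)$, and the reservoir temperature is ramped linearly. The names `heat_capacity` and `conductance` are hypothetical parameters in arbitrary units.

```python
def entropy_produced_during_ramp(duration, T_start=300.0, T_end=400.0,
                                 heat_capacity=1.0, conductance=1.0,
                                 steps=100_000):
    """Integrate dS_i/dt = (1/T - 1/T_r) dE/dt over a linear ramp of T_r.

    Illustrative model only: E = heat_capacity * T and
    dE/dt = conductance * (T_r - T), at fixed V and N.
    """
    dt = duration / steps
    T = T_start
    produced = 0.0
    for i in range(steps):
        # Reservoir temperature at the midpoint of this time step
        T_r = T_start + (T_end - T_start) * (i + 0.5) / steps
        dE = conductance * (T_r - T) * dt
        produced += (1.0 / T - 1.0 / T_r) * dE  # never negative, term by term
        T += dE / heat_capacity
    return produced

produced = {tau: entropy_produced_during_ramp(tau) for tau in (10.0, 100.0, 1000.0)}
```

In this model the system temperature lags the reservoir by an amount proportional to the ramp rate, so the entropy produced over the whole ramp falls roughly in inverse proportion to its duration.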

An idealised process of this kind has a very special character. Imagine going forward 
through some quasistatic process driven by a time evolution in environmental variables. 
If the evolution of the environmental variables were then reversed, the system variables 
would follow them, without any time lag, and once again no overall entropy would be 
produced. Eventually, we could recover the initial state of both system and environment, 
but only if we perform the forward and backward processes very slowly and hence with 
no overall increase in entropy. 

This sort of entropy conserving thermodynamic process is called reversible. We 
regard it as synonymous with the word quasistatic, although in principle there might be 
dynamical systems that undergo discontinuous changes in properties even when driven 
quasistatically, and this might affect the way in which equilibrium between system 
and environment can be maintained throughout the process. A reversible process is an 
idealisation of a real process: no process can actually develop so slowly that no entropy 
is generated, but it is an idealisation that is very useful in thermodynamics. In contrast, 
all real processes are characterised by entropy production. The entropy generated going 
forward cannot be effaced by a reversal of the external driving forces in an attempt to 
restore the initial state: in recognition of this they are called irreversible processes. 

The reversibility of a Carnot engine or pump was invoked in Section 2.9 to support 
the assertion that the efficiency of such systems does not depend on the working substance, 
although the word was used in a general sense. We see now that thermodynamic 
reversibility has a technical sense associated with zero entropy production. If an engine 
were driven at a finite rate it would not perform a Carnot cycle and it would not be 
reversible. It would generate entropy going forward, and if we were to drive it backwards 
as a pump, it would also generate entropy. It would be an irreversible cyclic process 
and (2.40) demonstrates that it would be less efficient than a reversible cycle. 


2.16 Statements of the Second Law 

This chapter began with a listing of the four laws of thermodynamics, but most of 
the discussion has concerned the second law. This is because it involves the topic of 
entropy, about which more can be said than any other concept in thermodynamics. It is a 
law that concerns nonquasistatic thermodynamic processes. Traditionally, presentations 
of statistical physics have focussed largely on the equilibrium properties of matter, and 
how they change in a quasistatic process. From such a perspective, the second law seems 
strangely peripheral: here it has taken centre stage. 

The second law states that all spontaneous thermodynamic processes, initiated by the 
removal of a constraint, increase the entropy of the universe. In fact this is a slightly 
incomplete statement: as entropy is a property of a system in equilibrium, the idea to be 
conveyed is that if the universe, after the constraint is removed, could ever be brought 
back into equilibrium, it would have an entropy that is larger than it was before the 
process started. Quasistatic processes are engineered to leave the entropy unchanged by 
taking an infinite amount of time to complete: these are thermodynamically reversible, but 
clearly not realistic. The second law forbids certain directions, or better put destinations, 
of a spontaneous process: of all possible end-points, Nature excludes those where the 
total entropy of all the relevant components is lower than at the start. The second law 
may therefore be expressed as a number of statements of the form ‘this does not happen’. 
They normally correspond to familiar experience. 

• Clausius statement: Heat cannot flow spontaneously from a system at a low 
temperature to a system at a higher temperature; 

• Kelvin statement: It is impossible in a cyclic process to convert heat completely into 
mechanical work (i.e. no engine can have an efficiency of 100%). 

If the Kelvin statement is violated, it can be shown by use of a Carnot engine that the 
Clausius statement is violated too. 

Other statements in the same spirit include: 

• Matter cannot flow spontaneously from a system with a low chemical potential to a 
system at the same temperature and a higher chemical potential; 

• A system at a low pressure will not expand at the expense of a system at a high 
pressure; 

• A gas will not spontaneously contract to occupy a volume smaller than its container; 

• Energy is reduced in ‘quality’ by natural processes (bulk potential energy is high 
quality energy, heat is low quality). This is equivalent to saying ‘energy is dissipated’. 

We have shown that these phenomena all appear to be manifestations of the second law. 
But we have not proved that the second law should hold: this is beyond the remit of
classical thermodynamics, which is why it is taken as one of the axiomatic laws from which
thermodynamic behaviour can be deduced. However, a proof of sorts will be presented using
ideas of random dynamics in Chapter 17.

A corollary to these ideas, to be developed further in Chapter 3, is that after a process 
is initiated, Nature chooses one particular destination over all others, namely the available 
end-point of the process with the greatest total entropy. The argument for this goes as 
follows. Imagine some other possible final state of the system, with lower total entropy. 
It can be established as an equilibrium state of the system, through the application of a 
suitable constraint. For example, in a free expansion of a gas from a volume $V_0$ into a
volume $V_1$, as in Section 2.6, the evolution could be halted at an intermediate volume
$V_m$ if an enclosing container with this volume were inserted. The entropy change in
such an arrested expansion would be $Nk \ln(V_m/V_0)$, lower than the entropy change in
the expected final state, $Nk \ln(V_1/V_0)$. But the point is that such intermediate states can
only be maintained against evolution towards the state with the highest entropy by the
existence of the necessary constraints. If the constraints were removed, we would expect
to see the system continue to seek a state with higher entropy, in accordance with the
second law. The conclusion is that evolution can only cease when the entropy is at its
highest value consistent with the remaining constraints. Otherwise, we would require
constraints that do not exist, such as a container with a volume less than $V_1$.
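
The comparison between the arrested and the completed expansion can be checked numerically. The following is a minimal sketch, assuming one mole of ideal gas and illustrative volumes in arbitrary units (the function name and the numbers are hypothetical, chosen only for the example):

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def free_expansion_entropy(n_particles, v_initial, v_final):
    """Entropy change N k ln(V_final / V_initial) for the free expansion of an ideal gas."""
    return n_particles * K_B * math.log(v_final / v_initial)


# Assumed illustrative values: one mole of gas, a container that doubles
# the volume, and a hypothetical intermediate container at 1.5 times V0.
N = 6.02214076e23
V0, Vm, V1 = 1.0, 1.5, 2.0

dS_arrested = free_expansion_entropy(N, V0, Vm)
dS_full = free_expansion_entropy(N, V0, V1)
print(f"arrested: {dS_arrested:.3f} J/K, completed: {dS_full:.3f} J/K")
```

Any intermediate volume between V0 and V1 gives a smaller entropy change than the completed expansion, which is the sense in which the unconstrained end-point has the greatest entropy.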

So when a constraint is removed, the universe evolves spontaneously to a new 
equilibrium state that maximises its overall entropy. The universe, in the words of 
Clausius, strives to maximise its entropy through the variation of any unconstrained 
state variables, in other words by rearranging its constituents. The second law seems to 
be a so-called variational principle, a requirement to maximise or minimise something: 
it is one of several such principles to be found in physics, a property that makes it 
rather beautiful, but still somewhat mysterious. 

2.17 Classical Thermodynamics: the Salient Points

It is worth summarising the main features of classical thermodynamics. 

• Thermodynamics is fundamentally a theory of the transfer of heat and matter between 
macroscopic physical systems, and of the equilibrium states to which such transfers 
lead, which are defined by time-independent properties and the absence of fluxes of 
energy or particles within them. 

• We often imagine a system to interact with an environment that has idealised properties 
based on its presumed large size. The environment is also referred to as a reservoir, 
or a heat, volume or particle bath. 

• Macroscopic systems are described by state variables. Some are well defined whether 
the system is in equilibrium or in a state of evolution: examples are energy, volume and 
the number of particles. Others are properly defined only in equilibrium, such as
temperature, pressure, chemical potential and entropy, although to a rough approximation
they can be applied to systems that are mildly perturbed from equilibrium.

• Thermodynamics provides relationships between state variables, such as an equilibrium
equation of state or a specification of an increment in one variable in terms of
increments in others, for example the fundamental relation of thermodynamics.

• The equilibrium state of an isolated system has the maximum value of entropy of all 
possible macroscopic arrangements of the system, subject to any imposed constraints 
such as an enclosing volume, or a fixed energy and particle number. 

• Removal of constraints typically leads to the evolution of a system and its environment. 
If such a process takes place at a finite rate, or nonquasistatically, the new equilibrium 
that is established between them will always have a larger total entropy. In a sense, 
we can consider that the process of evolution is accompanied by a rate of production 
of entropy. 


Exercises 

2.1 Estimate the entropy generated by a typical physics professor in one day. 

2.2 (a) A piston is used to compress an ideal gas quasistatically from volume $V_i$ to
volume $V_f$. If the gas is in thermal contact with a heat bath at temperature T, such
that the compression is carried out isothermally, calculate the work done on the gas,
the change in entropy of the gas and the change in entropy of the heat bath. (b) The
compression is repeated but nonquasistatically. Are the three calculated quantities
higher, lower, or the same as before? (c) The procedure is repeated but without
contact with the heat bath such that the compression is adiabatic and quasistatic.
The initial temperature is T. Again, what values do the three quantities take? What
is the final temperature? (d) Now assume the compression is adiabatic but
nonquasistatic. Are the three quantities different from those in (c)? Is the final temperature
greater than, less than or the same as the final temperature in (c)?

2.3 A 1 kg block of steel at temperature 60 °C is placed at the bottom of a lake of depth
10 m at temperature 10 °C. The specific heat capacity of steel is 420 J K$^{-1}$ kg$^{-1}$ and is
approximately temperature independent. Calculate the entropy change of the block,
the lake and the universe, after thermal equilibrium has been reached. A crane is 
used to lift the block extremely slowly from the bottom of the lake to the surface. 
What is the entropy change of the universe? Then it is dropped back in. Again 
comment on the entropy change of the universe. 

2.4 Consider two systems each containing an ideal gas of N particles in a volume
V but with different temperatures $T_1$ and $T_2$. The volume of each system is kept
constant while thermal contact is made between them. After equilibrium has been
established, demonstrate that the total entropy of the universe has changed by
$\Delta S = 3Nk \ln\left((T_1 + T_2)/(2(T_1 T_2)^{1/2})\right)$ and that this is never negative.


3 Applications of Classical Thermodynamics


We discuss some applications of the ideas developed in Chapter 2. We focus on properties 
of entropy, on the criteria for equilibrium and how they arise from the second law. 


3.1 Fluid Flow and Throttling Processes 

Thermodynamics can be applied to materials in motion, such as fluid flow, as long as 
we employ local state variables such as pressure and temperature on the understanding 
that they are only approximate descriptors for systems that are out of equilibrium. It is 
beyond the scope of this book, but we could develop models for the transport of heat 
by convection and conduction, or for the evolution of the motion of a fluid caused by 
gradients in the pressure. 

We shall discuss only one fluid flow problem, partly because it is simple, and partly 
because it involves entropy production, an inevitable feature if we have systems away 
from equilibrium. It is known variously as plug flow, throttling, or the Joule-Thomson 
process and it plays a role in the industrial cooling of gases. 

We consider a fluid flowing uniformly along a pipe, such that from the point of view of 
an observer moving at the same speed in the direction of motion, it might be considered 
to be in equilibrium and to possess a uniform pressure. Real fluids have viscosity and 
could only be kept in such a state of motion by imposing a pressure gradient, but we 
ignore this. There is a plug of porous material occupying a section of the pipe that does 
offer resistance to the flow. In order to force the fluid through the plug, a piston or 
similar device has to perform work on the fluid upstream of the plug. The flow passes 
through the plug in a complicated manner, but emerges into the clear section of pipe 
and eventually settles back into a steady uniform flow, and pushes a downstream piston. 
There are a number of contradictory features here: the fluid is assumed to be without 
viscosity, and yet eventually loses the turbulent pattern of flow brought about by the 
plug. We idealise the behaviour in order to say something about entropy generation. 

Figure 3.1 Fluid flow along a pipe and through a resistive plug, giving rise to density, temperature
and pressure changes, enthalpy conservation and entropy generation. 


The walls of the pipe are thermally insulated, so the balance of energy input and output 
for a system consisting of a translating tube of fluid passing through the plug consists 
of work done on the fluid, work performed by it and energy change arising from the 
difference in temperature between the upstream and downstream flows. Referring to 
Figure 3.1, we denote the inlet and outlet pressure and temperature as $p_{i,o}$ and $T_{i,o}$,
respectively. Assuming that the work processes are quasistatic, we write the balance of
overall work done on the fluid as $\Delta W = p_i \Delta V_i - p_o \Delta V_o$, where $\Delta V_i$ and $\Delta V_o$ are the
(positive) volumes vacated at the inlet and occupied at the outlet by the tube of fluid in
a particular period of time as a result of the flow. The change in energy of the tube is
$e(n_o, T_o)\Delta V_o - e(n_i, T_i)\Delta V_i$, where e is the energy per unit volume of the fluid, written as
a function of particle density n and temperature. Equating these energy changes, we write

$$p_i \Delta V_i - p_o \Delta V_o = e(n_o, T_o)\Delta V_o - e(n_i, T_i)\Delta V_i \tag{3.1}$$

or

$$\left(e(n_i, T_i) + p_i\right)\Delta V_i = \left(e(n_o, T_o) + p_o\right)\Delta V_o. \tag{3.2}$$

This is a conservation law. Essentially, a fluid packet of initial volume $\Delta V_i$ passes down
the pipe, changing its volume to $\Delta V_o$ well downstream of the plug, and altering its
pressure and temperature too, but maintaining its enthalpy, a state variable defined by

$$H = E + pV. \tag{3.3}$$


Typically, the plug produces a pressure drop in the flow ($p_i > p_o$), but, depending on
the nature of the gas, it can experience either an increase or a decrease in temperature.
We shall not discuss this rich range of behaviour, but instead focus on an ideal gas to
investigate the entropy generation. Plug flow is analogous to a steady state version of
free expansion, and we expect internal entropy production. We have $E = (3/2)NkT$ and
$pV = NkT$, so $H = (5/2)NkT = (5/2)pV$. The drop in pressure therefore gives rise to
an increase in volume of a packet of fluid after its passage through the plug. The entropy
change of a packet containing N particles is given by

$$\Delta S = Nk \ln\left(\frac{V_o}{V_i}\right) = Nk \ln\left(\frac{p_i}{p_o}\right) > 0, \tag{3.4}$$

using (2.19) together with conservation of enthalpy, $\frac{5}{2} p_o V_o = \frac{5}{2} p_i V_i$, and this is entirely
due to internal generation, since the heat flow into the fluid tube is zero.
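
As a rough numerical illustration of the ideal gas case, the sketch below takes a packet of one mole throttled from 2 bar to 1 bar. The inlet volume and pressures are assumed values, and the function name is hypothetical. Because $H = (5/2)pV$ is conserved, the product pV and hence the temperature are unchanged, so the entropy change follows from the volume ratio alone:

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def throttle_ideal_gas(n_particles, p_in, v_in, p_out):
    """Return (v_out, delta_S) for an ideal gas packet throttled at constant enthalpy.

    For an ideal gas H = (5/2) p V, so conservation of enthalpy gives
    p_out * v_out = p_in * v_in and the temperature does not change.
    """
    v_out = p_in * v_in / p_out
    delta_s = n_particles * K_B * math.log(v_out / v_in)
    return v_out, delta_s


# Assumed illustrative values: one mole, 2 bar upstream, 1 bar downstream.
N = 6.02214076e23
v_out, dS = throttle_ideal_gas(N, p_in=2.0e5, v_in=0.0125, p_out=1.0e5)
print(f"outlet volume: {v_out:.4f} m^3, entropy generated: {dS:.3f} J/K")
```

Swapping the two pressures would make the computed entropy change negative, which is the numerical counterpart of the statement that the second law requires a pressure drop across the plug.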

In general, we can show that the second law requires there to be a pressure drop by
writing $dH = dE + p\,dV + V\,dp$ and inserting the fundamental relation (2.49) to obtain
$dH = T\,dS + \mu\,dN + V\,dp$ such that

$$dS = \frac{1}{T}dH - \frac{\mu}{T}dN - \frac{V}{T}dp, \tag{3.5}$$

meaning that a drop in pressure of a packet of fluid at constant enthalpy and particle
number is associated with an increase in system entropy. In other words, a packet that
passes through the plug emerges with the same values of N and H, and will have increased
its entropy as a consequence of the irreversible processes brought about by the resistance
to the flow, and (3.5) tells us that its pressure will have decreased.

3.2 Thermodynamic Potentials and Availability 

According to the variational statement of the second law of thermodynamics given in 
Section 2.16, an isolated system moves towards equilibrium by rearranging itself to 
maximise its entropy. Free expansion of a gas is a good example. This must be modified,
however, if the system can exchange energy, volume or particles with its environment.
The entropy of the universe is maximised when equilibrium is found, but the change 
in the entropy of the system can be negative if the system temperature goes down, for 
example. Is there a different variational principle we can use? 

We must apply the second law to both the system and the environment in order to 
find a principle that indicates how system variables should change. We consider a system 
in equilibrium characterised by a set of thermodynamic variables, and then place it in 
contact with an environment to allow them both to evolve towards a new equilibrium 
state. We require 


$$\Delta S + \Delta S_r \geq 0, \tag{3.6}$$

to a maximal extent, where $\Delta S_r$ is the entropy change of the environment. From the
fundamental relation (2.49), we deduce that $T_r \Delta S_r = \Delta E_r + p_r \Delta V_r - \mu_r \Delta N_r$ because the
reservoir is at a constant temperature $T_r$, pressure $p_r$ and chemical potential $\mu_r$ throughout
the process. By conservation of energy, volume and particle number, the changes in
system variables are given by $\Delta E = -\Delta E_r$, $\Delta V = -\Delta V_r$ and $\Delta N = -\Delta N_r$, so

$$T_r \Delta S + T_r \Delta S_r = T_r \Delta S - \Delta E - p_r \Delta V + \mu_r \Delta N \geq 0. \tag{3.7}$$

This may be summarised in the statement

$$\Delta A = \Delta E + p_r \Delta V - T_r \Delta S - \mu_r \Delta N \leq 0, \tag{3.8}$$

to a maximal extent, where we define the availability A as

$$A = E + p_r V - T_r S - \mu_r N. \tag{3.9}$$


As the initial availability is fixed, the new equilibrium state of the system is characterised 
by changes that minimise the availability of the final state. 

Note that this variational principle involves a quantity A that is a function of extensive
system variables E, V, S and N, and intensive environmental variables $p_r$, $T_r$ and $\mu_r$.
The intensive system variables p, T and $\mu$ do not appear. The minimisation is over
extensive system variables, and any internal partitioning of these quantities between
different constituents of the system.

However, these variables might be constrained in various ways, depending on the
nature of the coupling between system and environment. If the system were isolated,
such that N, E and V remain constant during all conceivable spontaneous processes, then
$A = -T_r S$ + constants and the minimisation of availability would correspond to the
maximisation of system entropy $S(E, N, V)$ at constant N, E and V. This maximisation
is over all internal arrangements of the constituents, for example all possible density 
profiles across the system volume. If the initial profile were nonuniform, we would 
expect the system to evolve to a uniform profile to maximise its entropy, just as we saw 
in the case of a free expansion of an ideal gas. 

3.2.1 Helmholtz Free Energy 

Now consider a new situation where the system volume V and particle content N are held 
constant during the process, but heat transfers are possible such that E might change. The 
$p_r V$ and $\mu_r N$ terms in the availability are constant and can be ignored, so the availability
is $A = E - T_r S$ + constants. Let us see what the minimisation of A would mean for an
ideal gas placed in contact with a heat bath. Using (2.21), the reduced availability (i.e.
ignoring constants) is

$$A_R(E) = E - T_r S = E - N k T_r \ln\left[\,\cdots\,\right]. \tag{3.10}$$

The state variable E is unconstrained and its value in the final equilibrium is to be
selected variationally. By setting to zero the derivative of this expression with respect
to E, we find that

$$1 - \frac{3}{2}\frac{N k T_r}{E} = 0, \tag{3.11}$$

implying that the final system energy is given by $3NkT_r/2$, and that the system temperature
evolves to $T_r$. More generally, we would write

$$\frac{dA_R}{dE} = 1 - T_r\left(\frac{\partial S}{\partial E}\right)_{N,V} = 1 - \frac{T_r}{T} = 0, \tag{3.12}$$

using the relationship (2.45), which gives the same result. We expected this, and indeed
assumed that it would happen when considering heat transfer and entropy production in
Section 2.7. We now see it as a consequence of the second law.
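
The variational selection of E can be illustrated numerically. The sketch below assumes only that the entropy of a monatomic ideal gas depends on energy through a term $(3/2)Nk \ln E$, so that the E-dependent part of the reduced availability is $E - (3/2)NkT_r \ln E$; the golden-section search and the units ($NkT_r = 1$) are choices made for the example, not part of the argument above:

```python
import math


def reduced_availability(energy, n_k_tr):
    """E-dependent part of A_R(E) = E - T_r S(E) for a monatomic ideal gas.

    Only the (3/2) N k ln E term of the entropy depends on E; all other
    terms are constants in this minimisation and are dropped.
    """
    return energy - 1.5 * n_k_tr * math.log(energy)


def golden_section_minimum(f, lo, hi, tol=1e-10):
    """Locate the minimum of a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


NKT_R = 1.0  # N k T_r in arbitrary energy units (assumed)
E_star = golden_section_minimum(lambda e: reduced_availability(e, NKT_R), 0.1, 10.0)
print(f"E at minimum / (N k T_r) = {E_star / NKT_R:.6f}")
```

The minimum sits at an energy of 3/2 in units of $NkT_r$, in line with (3.11), whatever starting interval is used so long as it brackets that value.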

The form taken by the reduced availability in the final state is $E(T_r, N, V) -
T_r S(E(T_r), N, V)$. This combination of state variables has a special name, the
Helmholtz free energy, and is an example of a so-called thermodynamic potential:

$$F(N, V, T) = E - TS. \tag{3.13}$$

The minimised reduced availability for the example of an ideal gas is written, using (2.6)
and (2.22), as

$$A_R(E(T_r)) = F_{ig}(N, V, T_r) = \frac{3}{2} N k T_r \left[1 - \ln\left(\frac{k T_r}{(cN/V)^{2/3}}\right)\right], \tag{3.14}$$

where the Helmholtz free energy is labelled to be that of an ideal gas (ig).

Note that the Helmholtz free energy has been cast here as a function of the state 
variables N, V and T. The energy and the entropy can both be expressed as functions of 
these variables. These are regarded as the ‘natural’ variables for F, for reasons we shall 
understand better at the end of this section. 

The selection of equilibrium of a system under constraints of fixed N, V and T is
sometimes portrayed simply as the minimisation of $F = E - TS$, which carries the
suggestion that both S and T in this expression are allowed to vary as E is changed. It is
clearly more accurate to state that we are to minimise the reduced availability $E - T_r S$
over the system energy E as well as any parameters involving the internal arrangement
of the constituents of the system. But at some point in this minimisation procedure, the
reduced availability can often take the apparent form of a Helmholtz free energy or a
sum of such terms.

Figure 3.2 Two stages of availability minimisation to seek a new equilibrium. First, thermal
contact is opened between two subvolumes of a system and the environment, bringing both of them
to the temperature $T_r$. Secondly, particle exchange is opened up between the subvolumes, which
leads to an equalisation of their chemical potentials, and for an ideal gas, an equality between
particle densities $N_1/V_1$ and $N_2/V_2$. The second stage can be regarded as the minimisation of
the Helmholtz free energy of the system over the degree of partition of particles between the
subvolumes.

For example, consider a gas initially constrained to have a nonuniform density profile
across a fixed volume. We suppose that in the initial arrangement, $N_1^0$ particles are
confined to a subvolume of the system $V_1$, and $N_2^0 = N - N_1^0$ particles reside in the
remaining volume $V_2 = V - V_1$, with both parts at arbitrary temperatures, as illustrated
in the first situation in Figure 3.2. When the partition between the subvolumes is removed,
and when we put the system in contact with a heat bath, we expect it to evolve to
equalise the particle densities in the subvolumes, and to bring both subvolumes to the
temperature of the heat bath. In order to determine the final equilibrium, we first minimise
the availability with respect to the energies $E_1$ and $E_2$ in each subvolume, without
changing the particle content in each, which, as we saw above, has the effect of requiring
the temperature in each subvolume to equalise with that of the heat bath. This part is
represented by the transition from the first to the second situation in Figure 3.2. The
reduced availability is then a sum of Helmholtz free energies for the two subvolumes.
A further minimisation of the availability is then done with respect to the number of
particles $N_1$ in volume $V_1$. Using the free energy just derived for a monatomic ideal
classical gas, we write

$$A_R(N_1) = F_{ig}(N_1, V_1, T_r) + F_{ig}(N - N_1, V_2, T_r), \tag{3.15}$$

and we select the equilibrium value of $N_1$ by setting the derivative $dA_R/dN_1$ equal to
zero:

$$\frac{dA_R}{dN_1} = k T_r \ln\left(\frac{N_1}{V_1}\right) - k T_r \ln\left(\frac{N - N_1}{V_2}\right) = 0, \tag{3.16}$$

which implies that in the equilibrium state, shown as the third situation in Figure 3.2,
the particles are partitioned between the two subvolumes according to

$$\frac{N_1}{V_1} = \frac{N - N_1}{V - V_1} = \frac{N_2}{V_2}, \tag{3.17}$$

or in other words, with equal densities, as expected. The minimised availability is then
given by $F_{ig}(N_1, V_1, T_r) + F_{ig}(N_2, V_2, T_r) = F_{ig}(N, V, T_r)$.
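
The second stage of the minimisation can be sketched numerically. The example assumes that, for an ideal gas at temperature $T_r$, the N-dependence of each Helmholtz free energy is $NkT_r \ln(N/V)$ plus terms linear in N, so that the linear terms cancel in the derivative with respect to $N_1$ at fixed total N. The particle number, the subvolumes and the bisection routine are illustrative choices:

```python
import math


def d_availability_dn1(n1, n_total, v1, v2):
    """dA_R/dN_1 in units of k T_r for two ideal-gas subvolumes at temperature T_r."""
    return math.log(n1 / v1) - math.log((n_total - n1) / v2)


def bisect_root(f, lo, hi, tol=1e-9):
    """Find the root of an increasing function f on [lo, hi], with f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed illustrative values: 1000 particles shared between subvolumes 1 and 3.
N_TOTAL, V1, V2 = 1000.0, 1.0, 3.0
n1_eq = bisect_root(
    lambda n1: d_availability_dn1(n1, N_TOTAL, V1, V2), 1e-6, N_TOTAL - 1e-6
)
print(f"N_1 = {n1_eq:.3f}, densities: {n1_eq / V1:.3f} and {(N_TOTAL - n1_eq) / V2:.3f}")
```

The stationary point divides the particles in proportion to the subvolumes, which is the equal-density condition (3.17).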

Notice that the chemical potentials of the gas in each subvolume, given according
to (2.53) by $\mu = kT \ln(\lambda_{th}^3 N/V)$, are equal when equilibrium has been established,
a principle that was assumed when considering particle flow and entropy generation in
Section 2.10. This applies quite generally for substances other than ideal gases. When we
minimise $A_R(N_1) = F_1(N_1, V_1, T_r) + F_2(N_2, V_2, T_r)$ to determine the partition of particles
between subvolumes 1 and 2 for an arbitrary substance we write

$$\frac{dA_R}{dN_1} = \left(\frac{\partial F_1}{\partial N_1}\right)_{V_1,T_r} + \left(\frac{\partial F_2}{\partial N_2}\right)_{V_2,T_r}\frac{dN_2}{dN_1} = 0. \tag{3.18}$$


To proceed further we differentiate (3.13): 


$$dF = dE - T\,dS - S\,dT, \tag{3.19}$$

and use the fundamental relation $dE = T\,dS - p\,dV + \mu\,dN$ to give

$$dF = -p\,dV + \mu\,dN - S\,dT, \tag{3.20}$$

and then construct the partial derivative

$$\left(\frac{\partial F}{\partial N}\right)_{V,T} = \mu, \tag{3.21}$$

in which case (3.18) reduces to $\mu_1(N_1, V_1, T_r) = \mu_2(N_2, V_2, T_r)$, having inserted
$dN_2/dN_1 = -1$ since $N_1 + N_2 = N$, and we conclude that equilibrium requires equality
of chemical potentials.

Notice in (3.20) that an increment in F is given in terms of increments in N, V and 
T. We could of course relate the latter to increments in other state variables, but the 
expression would be more complicated. This is one reason why F is considered to be a 
natural function of variables N, V and T. 


3.2.2 Why Free Energy? 

The terminology of free energy is a little archaic and comes from considering the
performance of quasistatic mechanical work on a system constrained throughout the process
to be isothermal with respect to an environment at a temperature T. The work done
is $\Delta W = \Delta E - \Delta Q$, and the quasistatic heat transfer from the environment that arises
as a consequence of the process is $\Delta Q = T\Delta S$, so $\Delta W = \Delta(E - TS) = \Delta F$ as T
is a constant. For nonquasistatic work, $\Delta Q = T(\Delta S - \Delta S_i)$, in which case
$\Delta W = \Delta E - T\Delta S + T\Delta S_i = \Delta(E - TS) + T\Delta S_i = \Delta F + T\Delta S_i$, and therefore

$$\Delta W - \Delta F = T\Delta S_i \geq 0, \tag{3.22}$$

since $\Delta S_i \geq 0$. The interpretation is that the work done is partly converted into a change
in the free energy of the system and partly wasted through a positive heat transfer to
the environment, of magnitude $T\Delta S_i$, that essentially corresponds to friction. As long
as the work is done quasistatically, all the mechanical work performed would be stored
in the system as a change in the state variable F.

Similarly, we could make the system perform work on some external body by way of
a thermodynamic process, while keeping the system isothermal with its environment. For
example, a compressed gas could be expanded from $V_0$ to $V_1$ so as to push a piston while
being maintained at a constant temperature. If the process were quasistatic, the (positive)
work done on the environment, $-\Delta W$, would correspond to the negative of
$\Delta F = F(N, V_1, T) - F(N, V_0, T)$, the change in the system state variable $F(N, V, T)$ over the
process. If the process were nonquasistatic, then the work done on the environment
would be given by $-\Delta W = -\Delta F - T\Delta S_i$, which would be lower than the maximum
work achievable ($-\Delta F$) for the corresponding quasistatic process. A process to extract
work at a rapid rate from a system is less effective than one carried out more slowly.
More of the stored free energy is wasted as frictional heat. The change in the Helmholtz
free energy F is therefore a measure of the maximum amount of energy possessed by
a system and its environment that can be ‘freed’ to perform mechanical work during a
specified isothermal process. Hence the name has emerged.
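
To put numbers to this, the sketch below takes one mole of ideal gas doubling its volume at 300 K. For an ideal gas at fixed N and T the energy is unchanged, so $\Delta F = -T\Delta S = -NkT \ln(V_1/V_0)$. The amount of internally generated entropy in the nonquasistatic case (2 J/K) is an assumed figure chosen only for illustration, and the function names are hypothetical:

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def helmholtz_change_isothermal(n_particles, temperature, v_start, v_end):
    """Delta F = -N k T ln(V_end / V_start) for an ideal gas at fixed N and T."""
    return -n_particles * K_B * temperature * math.log(v_end / v_start)


def work_extracted(delta_f, temperature, entropy_generated):
    """Work done on the surroundings: -Delta W = -Delta F - T Delta S_i."""
    return -delta_f - temperature * entropy_generated


# Assumed illustrative values: one mole at 300 K doubling its volume.
N, T = 6.02214076e23, 300.0
dF = helmholtz_change_isothermal(N, T, v_start=1.0, v_end=2.0)
w_max = work_extracted(dF, T, entropy_generated=0.0)   # quasistatic limit
w_fast = work_extracted(dF, T, entropy_generated=2.0)  # 2 J/K generated (assumed)
print(f"maximum work: {w_max:.1f} J, nonquasistatic work: {w_fast:.1f} J")
```

The shortfall between the two results is $T\Delta S_i$, the part of the free energy change that is dissipated rather than delivered as work.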

3.2.3 Contrast between Equilibria 

As an example of different forms of the principle of minimisation of thermodynamic 
availability for different environmental constraints, consider a system containing a pool of 
water. At a certain instant, a membrane is removed from the surface, allowing the liquid
to evaporate into the space above. A new equilibrium is reached when the water vapour
pressure reaches the so-called saturated vapour pressure $p_e(T)$, a material property that is
an increasing function of temperature, which we shall encounter again in Section 3.9.2.

The internal parameter of the system that is varied in order to minimise availability 
after the membrane is removed is the degree of partitioning of water molecules between 
the gas and liquid phases. If the system were thermally isolated while it did this, the 
principle of maximising system entropy would determine the final equilibrium state. 
However, such a state would be cooler than the initial liquid, as latent heat is needed 
to evaporate the liquid, and with the overall energy of the system held constant, this 
leaves less in the form of molecular kinetic energy, and hence we get a lower final 
temperature. The final vapour pressure would be the saturated vapour pressure at such a 
reduced temperature. 

If, on the other hand, the system remained in thermal contact with a heat bath throughout
the process, heat would flow into the system to keep the temperature the same. The
environment would supply the latent heat of evaporation. In the final equilibrium state
the vapour pressure would be the saturated vapour pressure at the original pool temperature
and it would be higher than the final vapour pressure for the isolated case. This
difference in outcome following the removal of the membrane is explained by noting that
in the isothermal case, we select the final equilibrium state by (effectively) minimising
the Helmholtz free energy of the system rather than by maximising its entropy. Both
situations correspond to the minimisation of the availability, but under different constraints.


3.2.4 Gibbs Free Energy 


Now consider the availability for a condition of constant N, but with the system volume
no longer constrained. Both the energy and volume of the system are allowed to vary
after the release of an initial constraint and the relevant part of A is the combination
$A_R(E, V) = E - T_r S(E, V) + p_r V$. We impose

$$\left(\frac{\partial A_R}{\partial E}\right)_V = 1 - T_r\left(\frac{\partial S}{\partial E}\right)_V = 1 - \frac{T_r}{T} = 0 \tag{3.23}$$

and

$$\left(\frac{\partial A_R}{\partial V}\right)_E = -T_r\left(\frac{\partial S}{\partial V}\right)_E + p_r = -\frac{T_r\,p}{T} + p_r = 0, \tag{3.24}$$

where (2.45) and (2.46) have been used, and the expected equilibrium conditions $p = p_r$
and $T = T_r$ emerge. The equilibrium values of energy and volume are $E(p_r, T_r)$
and $V(p_r, T_r)$. The minimised reduced availability $E(p_r, T_r) - T_r S(E(p_r, T_r), V(p_r, T_r)) +
p_r V(p_r, T_r)$ is equal to the so-called Gibbs free energy G of the system evaluated under
the prevailing conditions of reservoir pressure and temperature, where we define

$$G(N, p, T) = E - TS + pV. \tag{3.25}$$


This is another example of a thermodynamic potential. By analogy with the previous
discussion of Helmholtz free energy, if a system that is not open to particle transfers from
the environment has freedom to partition itself internally while in contact with a heat and
volume bath, then the choice of equilibrium state is made by minimising the availability.
After minimisation with respect to system energy, which sets the temperature equal to
that of the environment, the reduced availability resembles a sum of Gibbs free energies
characterising each internal subvolume of a system. The minimum of this function over
the partitioning is the Gibbs free energy of the final equilibrium state.

Figure 3.3 Minimisation of availability for a liquid-vapour system under conditions of constant
pressure, temperature and particle number. The equilibrium state is such that the particles are partitioned
between liquid and vapour phases to minimise the sum of their Gibbs free energies, in this case
when $N_{liq} = 2N_{vap}$. Note that $G_{vap}$ decreases as $N_{vap}$ increases, consistent with (3.28) and $\mu < 0$.

For example, consider again a system consisting of a pool of water in equilibrium with 
its vapour such that both have the same temperature and pressure. We expose the system 
to an environment at the same temperature T r but at a different pressure p r and allow the 
system to change its volume and any other parameters to seek out a new equilibrium. 
Clearly both the liquid and vapour need to assume the new pressure p r , but for the vapour 
to do so at constant temperature, there will need to be some evaporation or condensation 
of the liquid. The procedure to follow is to minimise $G_{liq}(N_{liq}, p_r, T_r) + G_{vap}(N_{vap}, p_r, T_r)$
over the partitioning of particles between liquid and vapour, with $N = N_{liq} + N_{vap}$ fixed.
This is illustrated in Figure 3.3. We write

$$\left(\frac{\partial G_{liq}}{\partial N_{liq}}\right)_{p_r,T_r} + \left(\frac{\partial G_{vap}}{\partial N_{vap}}\right)_{p_r,T_r}\frac{dN_{vap}}{dN_{liq}} = 0, \tag{3.26}$$

and proceed by differentiating (3.25) and inserting the fundamental relation:

$$dG = dE - T\,dS - S\,dT + p\,dV + V\,dp = -S\,dT + V\,dp + \mu\,dN, \tag{3.27}$$

to show that

$$\left(\frac{\partial G}{\partial N}\right)_{T,p} = \mu. \tag{3.28}$$


We again conclude that when particle exchange is possible between parts of a system, a 
condition for equilibrium is that each should acquire the same chemical potential. 

Incidentally, (3.27) demonstrates that G is a natural function of variables T, p and N. 
The convenience of this choice is also evident in the fact that T and p are constrained 
by the reservoir during the process while N is a constant for the system as a whole. 

These considerations also characterise equilibrium between mixtures of reactive gases
coupled to a heat and volume bath. If gas A can react with gas B to make gas C, while
C can dissociate back into A and B, written as A + B ⇌ C, then the equilibrium volume
of the mixture for a given pressure and temperature, and hence the balance between
reactants and product, is determined by a minimisation of the total G over the extent
of the partitioning between reactants and product. We can determine this balance in the
following way. The chemical potential of species A is $\mu_A = kT \ln n_A$ + const, where $n_A$
is the density of A, and similarly for species B and C. The minimisation of G requires

$$\mu_A\,dN_A + \mu_B\,dN_B + \mu_C\,dN_C = 0, \tag{3.29}$$

and as the reaction imposes relationships $dN_A = dN_B$ and $dN_C = -dN_A$, this corresponds
to $\mu_A + \mu_B - \mu_C = 0$ and hence $kT \ln(n_A n_B/n_C)$ = const, or $n_C \propto n_A n_B$, a
result known as the law of mass action.
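
The law of mass action can be illustrated with a short calculation. The sketch below assumes an ideal mixture at fixed pressure and temperature, writes each chemical potential as $\mu_j/kT = \ln n_j + c_j$ with dimensionless constants $c_j$ that are purely hypothetical, and works in units where the total particle density $p/kT$ is 1. It bisects on the number x of C particles formed until $\mu_A + \mu_B = \mu_C$:

```python
import math


def mu_balance(x, n_a0, n_b0, density_total, c_a, c_b, c_c):
    """Return (mu_A + mu_B - mu_C)/kT after x particles of C have formed."""
    n_tot = n_a0 + n_b0 - x  # total particle number falls as A + B -> C

    def density(count):
        # Ideal mixture at fixed p and T: V = n_tot * kT / p
        return density_total * count / n_tot

    return (math.log(density(n_a0 - x)) + c_a
            + math.log(density(n_b0 - x)) + c_b
            - math.log(density(x)) - c_c)


def equilibrium_densities(n_a0, n_b0, density_total, c_a, c_b, c_c, tol=1e-12):
    """Bisect on the extent x until mu_A + mu_B = mu_C; return (n_A, n_B, n_C)."""
    lo, hi = tol, min(n_a0, n_b0) - tol
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mu_balance(mid, n_a0, n_b0, density_total, c_a, c_b, c_c) > 0.0:
            lo = mid  # forming more C still lowers G
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    scale = density_total / (n_a0 + n_b0 - x)
    return scale * (n_a0 - x), scale * (n_b0 - x), scale * x


# Hypothetical constants c_j; two different starting mixtures of A and B.
CONSTS = dict(c_a=0.0, c_b=0.0, c_c=-1.0)
ratios = []
for n_a0, n_b0 in [(1.0, 2.0), (3.0, 0.5)]:
    n_a, n_b, n_c = equilibrium_densities(n_a0, n_b0, density_total=1.0, **CONSTS)
    ratios.append(n_c / (n_a * n_b))
print(ratios)
```

Both starting compositions settle at the same value of $n_C/(n_A n_B)$, fixed by the constants $c_j$ alone, which is the content of the law of mass action at a given temperature.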


3.2.5 Grand Potential 

We could also consider a system placed in contact with a heat and particle bath such
that the release of the initial constraint allows it to change its temperature and particle
number at constant volume. The relevant part of the availability is then $A_R(E, N) =
E - T_r S(E, N) - \mu_r N$. The minimisation over E at constant N establishes that the system
temperature T is equal to that of the environment $T_r$, as before. The condition for
minimisation with respect to N is

$$\left(\frac{\partial A_R}{\partial N}\right)_E = -T_r\left(\frac{\partial S}{\partial N}\right)_E - \mu_r = \frac{T_r\,\mu}{T} - \mu_r = 0, \tag{3.30}$$

and hence, as expected, the chemical potential of the system equalises with that of the
environment. Having achieved this, any further internal partitioning of the system is
determined by minimising a sum of terms of the form

$$\Phi(\mu, V, T) = E - TS - \mu N \tag{3.31}$$

for each subsystem. $\Phi$ is known as the grand potential and it is most naturally cast as
a function of $\mu$ and T, the constraints that are imposed on the system as it evolves, and
V, the volume of each subsystem that is to be varied to seek the new equilibrium.

For example, consider a system of volume V that can receive water vapour from a 
pool of water, with both in contact with a heat bath. The system is therefore coupled 
to an environment that can supply heat and particles, although, in contrast to the case 
examined in Section 3.2.4, the chemical potential of the source of particles is considered 
to be fixed. The reduced availability is minimised to yield the condition (3.30), and the 
equilibrium number of particles in the system is given by

$$N = \frac{V}{\lambda_{\mathrm{th}}^{3}(T_r)} \exp\left(\frac{\mu_r}{kT_r}\right) \tag{3.32}$$



using (2.53). The pressure $p_e(T_r) = NkT_r/V$ is then an expression of the saturated
vapour pressure referred to in Section 3.2.3. Recall that the chemical potential of a
classical gas, and hence that of the reservoir with which it is in equilibrium, is negative;
so this expression describes a saturated vapour pressure that rises with temperature, as
is found experimentally.

The state variables F, G and $\Phi$ are collectively known as thermodynamic potentials.
They are used to determine the internal partitioning of energy, volume and particles 
between different subsystems that are coupled to a common environment that is able to 
supply energy, energy and volume, and energy and particles, respectively, as we have 
seen. Technically, the enthalpy H encountered in Section 3.1 is also a thermodynamic 
potential, but not one that finds practical use so easily. Thermodynamic potentials represent
the relevant part of the availability, which in turn is a representation of the total
entropy of a system and its environment. When a constraint is removed to initiate change,
it is a rule of Nature that systems evolve to a new equilibrium state that minimises availability.
The second law operates in different guises and a myriad of physical phenomena
seem to correspond to consequences of the deceptively simple condition that $\Delta S_{\mathrm{tot}} \ge 0$.


3.3 Maxwell Relations 


James Clerk Maxwell (1831-1879) made seminal contributions to classical thermodynamics,
one of which was the derivation of relations that bear his name, and provide
connections between derivatives of state variables. They arise from expressions such as

$$dF = -p\,dV + \mu\,dN - S\,dT, \tag{3.33}$$


that relate a small change in a thermodynamic potential to small changes in other state 
variables. We derived this example in Section 3.2.1. 

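Now, an increment in the function F(V,N,T) may be written quite generally as

$$dF = \left(\frac{\partial F}{\partial V}\right)_{N,T} dV + \left(\frac{\partial F}{\partial N}\right)_{V,T} dN + \left(\frac{\partial F}{\partial T}\right)_{V,N} dT, \tag{3.34}$$

and therefore we can make identifications

$$p = -\left(\frac{\partial F}{\partial V}\right)_{N,T}, \qquad \mu = \left(\frac{\partial F}{\partial N}\right)_{V,T}, \qquad S = -\left(\frac{\partial F}{\partial T}\right)_{V,N}, \tag{3.35}$$

one of which is the expression for the chemical potential derived in (3.21). But as the
ordering of differentiation in second order partial derivatives is immaterial, such that

$$\left(\frac{\partial}{\partial T}\left(\frac{\partial F}{\partial V}\right)_{N,T}\right)_{V,N} = \left(\frac{\partial}{\partial V}\left(\frac{\partial F}{\partial T}\right)_{V,N}\right)_{N,T}, \tag{3.36}$$

we can deduce that

$$\left(\frac{\partial p}{\partial T}\right)_{V,N} = \left(\frac{\partial S}{\partial V}\right)_{N,T}, \tag{3.37}$$

and this is a Maxwell relation. We shall find this one very useful in Section 3.8.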
Another Maxwell relation can be derived starting with

(3.38)

giving

(3.39)

Alternatively, we could start with the derivative of the Gibbs free energy obtained
in (3.27):

$$dG = -S\,dT + V\,dp + \mu\,dN. \tag{3.40}$$

Maxwell relations that emerge from (3.40) include

$$\left(\frac{\partial V}{\partial T}\right)_{p,N} = -\left(\frac{\partial S}{\partial p}\right)_{T,N}. \tag{3.41}$$


Maxwell relations are quite general relations, valid for any substance. They are very 
powerful for establishing connections between different thermodynamic properties, as 
we shall see. 
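
The Maxwell relation (3.37) can be checked numerically in a few lines. The sketch below assumes the monatomic ideal gas, with entropy $S = Nk \ln[V(kT)^{3/2}/cN]$ and pressure $p = NkT/V$, and compares finite-difference estimates of the two derivatives. The units ($k = c = 1$), the state point and the helper names are arbitrary choices made for illustration.

```python
import math

k = 1.0    # Boltzmann constant in arbitrary units (assumption)
c = 1.0    # constant in the ideal gas entropy, set to one here (assumption)
N = 100.0  # particle number used for the check

def entropy(V, T):
    # Ideal gas entropy S = N k ln[V (kT)^{3/2} / (c N)]
    return N * k * math.log(V * (k * T) ** 1.5 / (c * N))

def pressure(V, T):
    # Ideal gas equation of state p = N k T / V
    return N * k * T / V

def central_diff(f, x, h=1e-5):
    # Symmetric finite-difference estimate of df/dx
    return (f(x + h) - f(x - h)) / (2.0 * h)

V0, T0 = 50.0, 2.0
dS_dV = central_diff(lambda V: entropy(V, T0), V0)   # (dS/dV) at constant T, N
dp_dT = central_diff(lambda T: pressure(V0, T), T0)  # (dp/dT) at constant V, N
print(dS_dV, dp_dT)  # both should be close to N k / V0
```

The same two-line comparison can be repeated for any model in which both S(V,T) and p(V,T) are known explicitly.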


3.4 Nonideal Classical Gas 


The monatomic classical ideal gas gave us the means to explore various phenomena 
involving entropy production in Chapter 2, but it is a very simple system. In order to 
understand the thermodynamic behaviour of more realistic substances, we need to employ 
more complex model equations of state. We shall begin this development by considering 
the virial expansion. The ideal gas equation of state $p = nkT$ is valid when the particle
density $n = N/V$ is small, such that particles are typically far apart and the assumption
that the particles do not interact with each other is a reasonable approximation, if the
interactions are short range. In contrast, the virial expansion consists of expressing the
pressure of a nonideal gas as a power series in n:

$$\frac{p}{kT} = \sum_{i=1}^{\infty} B_i(T)\, n^i, \tag{3.42}$$

where $B_1 = 1$, and $B_i$ for $i \ge 2$ is the $i$th virial coefficient, taken to be a function of
temperature and fitted to experimental data. The series when truncated at the second
virial coefficient $B_2$ is explored further in Section 3.8.

Johannes van der Waals (1837-1923) proposed an equation of state designed to 
represent the properties of gas and liquid phases of a fluid in a single expression. It 
takes the form

$$\left(p + \frac{aN^2}{V^2}\right)(V - bN) = NkT, \tag{3.43}$$


where a and b are positive parameters. The argument for the inclusion of the term 
proportional to the b parameter rests on noting that particles are not points, but possess a 
finite volume, a consequence of which is that the centres of mass of the particles are only 
able to move within a volume that is smaller than the container volume V. The reduction 
in available volume is proportional to N, and the volume deficit per particle is b. 

The motivation for the term containing the a parameter is that the particles interact
with one another, such that the velocity with which a particle hits the walls of the
container is reduced by attractive forces from neighbouring particles to its rear. The gas
pressure is related to the impact velocity, as we saw in Section 2.3, and the argument
can be adapted to employ a reduced velocity $v_x'$ due to deceleration before impact, such
that $p = nm\langle v_x'^2 \rangle$. Now, the reduction in squared velocity is proportional to the loss of
kinetic energy, which is the gain in potential energy per particle on moving from the
bulk of the gas to the wall. Writing $v_x^2 - v_x'^2 \propto n$, assuming that the change in potential
energy is proportional to the number of particles within a specified neighbourhood of the
particle, we conclude that $p = nm\langle v_x^2 \rangle - an^2$, where a is a positive constant. Accepting
this modification together with the effective change in confining volume yields the van 
der Waals equation (3.43). We shall revisit this derivation in Chapter 14 when we have 
acquired further tools from statistical thermodynamics. 


3.5 Relationship between Heat Capacities 


Heat capacities not only tell us how the temperature of a system increases as heat is 
supplied, but also represent properties of the system entropy. The difference between 
heat capacities C p and C v at constant pressure and constant volume is of particular 
interest. Starting with the first law for the quasistatic delivery of heat and work to a 
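system with fixed N, we write $dQ = dE + p\,dV = T\,dS$ and so

$$C_v = \left(\frac{\partial E}{\partial T}\right)_{V,N} = T\left(\frac{\partial S}{\partial T}\right)_{V,N}, \tag{3.44}$$

where the notation of (2.13) is slightly extended, and

$$C_p = \left(\frac{\partial E}{\partial T}\right)_{p,N} + p\left(\frac{\partial V}{\partial T}\right)_{p,N} = \left(\frac{\partial H}{\partial T}\right)_{p,N}, \tag{3.45}$$

relating $C_p$ to a derivative of enthalpy. But we can also write

$$C_p = T\left(\frac{\partial S}{\partial T}\right)_{p,N}, \tag{3.46}$$

and then

$$T\,dS = T\left(\frac{\partial S}{\partial T}\right)_{V,N} dT + T\left(\frac{\partial S}{\partial V}\right)_{T,N} dV = C_v\,dT + T\left(\frac{\partial p}{\partial T}\right)_{V,N} dV, \tag{3.47}$$

using the Maxwell relation (3.37), such that

$$C_p - C_v = T\left(\frac{\partial p}{\partial T}\right)_{V,N}\left(\frac{\partial V}{\partial T}\right)_{p,N}, \tag{3.48}$$

showing that the difference in heat capacities is related to properties of the system
equation of state $p(N,V,T)$. For an ideal gas, for example, with $p = NkT/V$, we have
$(\partial p/\partial T)_{N,V} = Nk/V = p/T$ and $(\partial V/\partial T)_{p,N} = Nk/p$ so we find that $C_p - C_v = Nk$.
For most substances $C_p > C_v$, since the pressure and volume increase with temperature
under the conditions of the partial derivatives on the right hand side of (3.48).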
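
The relation (3.48) lends itself to a short numerical illustration for any equation of state p(N,V,T). The sketch below estimates the two partial derivatives by finite differences, obtaining $(\partial V/\partial T)_{p,N}$ from the standard cyclic identity $(\partial V/\partial T)_p = -(\partial p/\partial T)_V / (\partial p/\partial V)_T$, which is not derived in the text. The units ($k = 1$), the van der Waals parameter values and the function names are assumptions chosen purely for illustration.

```python
k = 1.0  # Boltzmann constant in arbitrary units (assumption)

def ideal_gas(N, V, T):
    return N * k * T / V

def van_der_waals(N, V, T, a=0.1, b=0.05):
    # Equation (3.43) solved for p; a and b are illustrative values
    return N * k * T / (V - b * N) - a * N ** 2 / V ** 2

def cp_minus_cv(p_func, N, V, T, h=1e-6):
    """Evaluate C_p - C_v = T (dp/dT)_V (dV/dT)_p from p_func(N, V, T)."""
    dp_dT = (p_func(N, V, T + h) - p_func(N, V, T - h)) / (2.0 * h)
    dp_dV = (p_func(N, V + h, T) - p_func(N, V - h, T)) / (2.0 * h)
    dV_dT = -dp_dT / dp_dV  # cyclic rule, evaluated at constant p
    return T * dp_dT * dV_dT

diff_ideal = cp_minus_cv(ideal_gas, 1.0, 10.0, 1.0)
diff_vdw = cp_minus_cv(van_der_waals, 1.0, 10.0, 1.0)
print(diff_ideal, diff_vdw)  # ideal gas gives N k; attraction raises the difference slightly
```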

3.6 General Expression for an Adiabat

A relationship between state variables describing a system with constant entropy is called 
an adiabat, or isentrope, and we can derive the slope of such a line in the T-V diagram.
The relationship between constant volume and constant pressure heat capacities can be
inserted back into (3.47) to give

$$T\,dS = C_v\,dT + (C_p - C_v)\left(\frac{\partial T}{\partial V}\right)_{p,N} dV, \tag{3.49}$$

and a condition for increments of V and T that leave the entropy of a system constant is

$$dT = -(\gamma - 1)\left(\frac{\partial T}{\partial V}\right)_{p,N} dV, \tag{3.50}$$

where $\gamma$ is the ratio of heat capacities $C_p/C_v$, such that

$$\left(\frac{\partial T}{\partial V}\right)_{S,N} = -(\gamma - 1)\left(\frac{\partial T}{\partial V}\right)_{p,N} \tag{3.51}$$

specifies adiabats in the T-V diagram.

For an ideal gas, we have $(\partial T/\partial V)_{p,N} = p/Nk = T/V$ and $(\partial T/\partial V)_{S,N} = -(\gamma - 1)T/V$
which for constant $\gamma$ can be integrated to give $TV^{\gamma - 1} = \mathrm{constant}$, or equivalently
$pV^{\gamma} = \mathrm{constant}$. This is the form taken in (2.9), where the ratio $\gamma = 5/3$ applies. If we
define the isobaric coefficient of thermal expansion as $\alpha = V^{-1}(\partial V/\partial T)_{p,N}$, which
is evidently the fractional increase in volume of a substance with a fixed number of
particles, per unit increase in temperature at constant pressure, then the slope of an
adiabat for an arbitrary substance is given by

$$\left(\frac{\partial T}{\partial V}\right)_{S,N} = -\frac{\gamma(V,T) - 1}{V\alpha(V,T)}, \tag{3.52}$$

where we explicitly note the dependence of both $\gamma$ and $\alpha$ on V and T.
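
To see the adiabat condition at work, the ideal gas slope $(\partial T/\partial V)_{S,N} = -(\gamma - 1)T/V$ can be integrated numerically and compared with the closed form $TV^{\gamma - 1} = \mathrm{constant}$. The sketch below uses a fourth-order Runge-Kutta step; the starting temperature, the volumes and the function name are arbitrary illustrative choices.

```python
def integrate_adiabat(T0, V0, V1, gamma=5.0 / 3.0, steps=10000):
    """Integrate dT/dV = -(gamma - 1) T / V from V0 to V1 with RK4 steps."""
    def slope(V, T):
        return -(gamma - 1.0) * T / V

    h = (V1 - V0) / steps
    V, T = V0, T0
    for _ in range(steps):
        k1 = slope(V, T)
        k2 = slope(V + h / 2, T + h * k1 / 2)
        k3 = slope(V + h / 2, T + h * k2 / 2)
        k4 = slope(V + h, T + h * k3)
        T += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        V += h
    return T

T_numeric = integrate_adiabat(300.0, 1.0, 2.0)     # doubling the volume adiabatically
T_exact = 300.0 * (1.0 / 2.0) ** (5.0 / 3.0 - 1.0)  # from T V^(gamma - 1) = constant
print(T_numeric, T_exact)
```

For a substance with state-dependent $\gamma$ and $\alpha$, the same integrator could be reused with the slope replaced by the general expression (3.52).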


3.7 Determination of Entropy from a Heat Capacity 


Since entropy seems to be a central player in thermodynamics, it seems odd that we 
are not more familiar with instruments that can measure its value, in the manner of a 
thermometer. We now address how physical data might provide such an instrument, and 
emphasise that for such an instrument to work, there needs to be a reference system with 
known entropy. 

Since the Clausius integral (2.23) is the most fundamental definition of entropy change,
we should start by considering heat transfers and heat capacities. We write, using (3.44):

$$S(N,V,T) - S(N,V,T_0) = \int_{T_0}^{T} \frac{C_v(N,V,T')}{T'}\,dT', \tag{3.53}$$

or from (3.46)

$$S(N,p,T) - S(N,p,T_0) = \int_{T_0}^{T} \frac{C_p(N,p,T')}{T'}\,dT', \tag{3.54}$$

and these could lie behind the operation of an ‘entropy-ometer’ device to measure system
entropy indirectly from measurements of temperature and volume or pressure. In order
to calibrate it, heat capacities would need to be determined over the range $T_0$ to T. The
most suitable temperature for the reference system is $T_0 = 0$ K, where the entropy of any
system is defined to be zero according to the third law. It is interesting to contrast this
with the reference point for the measurement of temperature, the triple point of water,
which is rather more accessible to experiment.
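
A rough sketch of how such an ‘entropy-ometer’ calculation might run is given below: the heat capacity is integrated as C(T')/T' over temperature using Simpson's rule. The cubic heat capacity $C = AT^3$, the coefficient `A`, the temperature range and the function names are hypothetical choices made only to provide a case with a known answer, $A(T^3 - T_0^3)/3$. The lower limit is kept above zero so that the integrand can be evaluated directly.

```python
def entropy_change(heat_capacity, T0, T, steps=1000):
    """Composite Simpson estimate of the integral of C(T')/T' from T0 to T."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (T - T0) / steps

    def integrand(t):
        return heat_capacity(t) / t

    total = integrand(T0) + integrand(T)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * integrand(T0 + i * h)
    return total * h / 3.0

A = 2.0e-3  # hypothetical coefficient of a cubic heat capacity

def cubic_heat_capacity(t):
    return A * t ** 3

delta_S = entropy_change(cubic_heat_capacity, 1.0, 20.0)
print(delta_S, A * (20.0 ** 3 - 1.0 ** 3) / 3.0)
```

In practice the function passed in would interpolate tabulated heat capacity measurements rather than evaluate a formula.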


3.8 Determination of Entropy from an Equation of State 


Heat capacity data can provide the system entropy as a function of temperature as 
discussed in Section 3.7, while the volume or pressure dependence can be extracted 
from an empirical equation of state $p(N,V,T)$. We write

$$S(N,V,T) - S(N,V_0,T) = \int_{V_0}^{V} \left(\frac{\partial S}{\partial V'}\right)_{N,T} dV' = \int_{V_0}^{V} \left(\frac{\partial p}{\partial T}\right)_{N,V'} dV', \tag{3.55}$$

using the Maxwell relation (3.37). As an example, consider the virial expansion of
the pressure of a monatomic nonideal gas, truncated after the second virial coefficient:
$pV = NkT(1 + B_2(T)N/V)$, such that $|B_2(T)N/V| \ll 1$. Then

$$S(N,V,T) - S(N,V_0,T) = Nk \ln\left(\frac{V}{V_0}\right) - N^2 k\,\frac{d(TB_2)}{dT}\left(\frac{1}{V} - \frac{1}{V_0}\right). \tag{3.56}$$

The most appropriate reference case lies at $V_0 \to \infty$, when the properties of
the gas approximate to those of an ideal gas and $S(N,V_0,T) \to S_{\mathrm{ig}}(N,V_0,T) =
Nk \ln[V_0 (kT)^{3/2}/cN]$ according to (2.22), such that

$$S(N,V,T) = Nk \ln\left[\frac{V(kT)^{3/2}}{cN}\right] - \frac{N^2 k}{V}\frac{d(TB_2)}{dT}, \tag{3.57}$$

where the second term is the first correction to the ideal gas entropy. A similar procedure
could be used to obtain the pressure dependence of the entropy.

It is of interest to extract further conclusions from the virial expansion. Using (3.35)
a difference in the Helmholtz free energy can be written as

$$F(N,V,T) - F(N,V_0,T) = -\int_{V_0}^{V} p\,dV' = -NkT \ln\left(\frac{V}{V_0}\right) + N^2 kT B_2(T)\left(\frac{1}{V} - \frac{1}{V_0}\right), \tag{3.58}$$

and so the change in energy $\Delta E = \Delta F + T\Delta S$ is

$$E(N,V,T) - E(N,V_0,T) = -N^2 kT^2\,\frac{dB_2(T)}{dT}\left(\frac{1}{V} - \frac{1}{V_0}\right), \tag{3.59}$$

and we can use the same monatomic ideal gas reference case when $V_0 \to \infty$, namely
$E(N,V_0,T) = (3/2)NkT$, to give

$$E(N,V,T) = \frac{3}{2}NkT - \frac{N^2 kT^2}{V}\frac{dB_2(T)}{dT}. \tag{3.60}$$


The volume dependence indicates that after a free expansion of such a gas, there will be 
a temperature change, in contrast to the behaviour seen in Section 2.6 for an ideal gas. 

The energy expression can be interpreted in the following way. The extra contribution 
is a potential energy of interactions between the particles, represented by the temperature 
dependence of the second virial coefficient. The first term is of course the kinetic energy 
of the system. If we insist that the potential energy is independent of the kinetic energy, or
equivalently the temperature, then we require that

$$B_2(T) = b - \frac{a}{kT}, \tag{3.61}$$

where a and b are T-independent parameters, such that $dB_2/dT = a/kT^2$ and the extra
contribution to the energy is independent of T. Then the energy and entropy of the
gas are $E(N,V,T) = (3/2)NkT - aN^2/V$ and $S(N,V,T) = S_{\mathrm{ig}}(N,V,T) - bN^2 k/V$.
Furthermore, we can use this model for $B_2(T)$ to write the equation of state in the form

$$p = \frac{NkT}{V}\left(1 + \frac{bN}{V} - \frac{aN}{VkT}\right) \;\Rightarrow\; (p + an^2)(1 + bn)^{-1} = nkT, \tag{3.62}$$

where $n = N/V$ is the particle density, which takes a form very reminiscent of the van
der Waals equation of state (3.43).
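
How closely (3.62) resembles the van der Waals form can be illustrated numerically. The sketch below evaluates both pressures at low density, writing (3.43) in terms of the density as $(p + an^2)(1 - bn) = nkT$. The two differ only at order $n^3$, so doubling the density should multiply the discrepancy by roughly eight. Units with $k = 1$, the parameter values and the function names are assumptions for illustration.

```python
def p_virial(n, T, a, b, k=1.0):
    # Truncated virial form with B2 = b - a/(kT): p = n k T (1 + b n) - a n^2
    return n * k * T * (1.0 + b * n) - a * n * n

def p_van_der_waals(n, T, a, b, k=1.0):
    # Van der Waals form: (p + a n^2)(1 - b n) = n k T
    return n * k * T / (1.0 - b * n) - a * n * n

a, b, T = 1.0, 1.0, 1.0  # illustrative parameter values
gap_low = p_van_der_waals(0.01, T, a, b) - p_virial(0.01, T, a, b)
gap_high = p_van_der_waals(0.02, T, a, b) - p_virial(0.02, T, a, b)
print(gap_low, gap_high, gap_high / gap_low)  # the ratio is close to 2**3
```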

The manipulations of the various expressions in the last few sections illustrate that 
unexpected connections can be made between measurable material properties. This has 
been the principal role played by classical thermodynamics since its inception, but the 
unfortunate aspect is that it involves a considerable amount of calculus, which can be 
quite unsettling. There have not been many diagrams that help in understanding. Perhaps 
the effort has yielded results that might appeal only to a constructor of Carnot engines, as 
it has allowed us to specify the shape of adiabats or equivalently the T and V dependence 
of entropy, as illustrated in Figure 3.4. Not that there is anything intrinsically wrong with 
an intense interest in entropy! But in the next section, we encounter phenomena that are 
much more dramatic and physically important: transformations of phase. 


3.9 Phase Transitions and Phase Diagrams 

A discontinuous phase transition is a fascinating phenomenon whereby a small change 
in a system parameter, such as temperature, produces a large change in properties of the 
system. Familiar examples include the melting of a solid to produce a liquid, and the
boiling of a liquid to produce a gas, transitions that occur at a particular temperature
for a given pressure. The key point is that at the transition temperature, two (or more)
phases, or distinctly different states of the system, are in coexistence. This means they
are in equilibrium with respect to the transfer of particles from one to the other, and
this implies that they must have the same chemical potential: we have learnt that this
is the indicator of equilibrium between systems able to exchange particles. We need to
calculate the chemical potential of different phases and determine the conditions where
they become equal. We have already established the chemical potential of the monatomic
ideal classical gas in (2.53). We now need to consider other phases, such as liquids and
solids.

Figure 3.4 Entropy as a function of T and V for constant N, showing how the gradients are
related to the heat capacity and the equation of state.

3.9.1 Conditions for Coexistence 

Consider a system containing two phases of the same substance that are able to exchange 
particles with one another, illustrated in Figure 3.5 for gas and liquid phases, in conditions 
where volume and energy can be exchanged with an environment while the system 
pressure and temperature are fixed. We showed in Section 3.2.4 that when the system
has evolved to an equilibrium, it allocates $N_1$ of its particles to the gas phase and $N_2$ to
the liquid phase, in order to minimise the total Gibbs free energy as a function of these
variables. We write $G(N,p,T) = G_1(N_1,p,T) + G_2(N_2,p,T)$ and minimise it to give
$\mu_1\,dN_1 + \mu_2\,dN_2 = 0$ where $\mu_{1,2}$ are the chemical potentials of the two phases. Since
$N_1 + N_2$ is fixed, $dN_1 = -dN_2$, and this means that the chemical potentials of the two
phases inside the system should be equal.

We found in Section 3.2.4 that the chemical potential of a system is related to its
Gibbs free energy in the following way:

$$\mu = \left(\frac{\partial G}{\partial N}\right)_{T,p}. \tag{3.63}$$

Figure 3.5 A gas and liquid in coexistence under external constraints of temperature and pressure.
Evaporation and condensation occur as shown until both phases have the same chemical potential,
$\mu_1 = \mu_2$, as well as the same pressure and temperature.


Note that the state functions E, S and V are all extensive, that is, proportional to system 
size, while p and T are intensive. Thus G = E — TS + pV is also an extensive state 
variable and we recall that it is a natural function of N, p and T. The consequence of this 
is that we should be able to write $G(N,p,T) \propto N$ or $G(N,p,T) = K(p,T)N$, where
$K(p,T)$ is a function to be determined. In fact $K(p,T) = (\partial G/\partial N)_{T,p} = \mu$ and hence

$$G = \mu(p,T)N. \tag{3.64}$$

Thus the chemical potential is the Gibbs free energy per particle, and it may be regarded
as a natural function of pressure and temperature. Now we explore the consequences of
the coexistence condition $\mu_1(p,T) = \mu_2(p,T)$.



Figure 3.6 Chemical potentials of two phases (say liquid and gas) as a function of pressure at a 
given temperature. The phase with the lower chemical potential at a given pressure and temperature 
is thermodynamically stable, while the other is metastable. If the pressure exerted on a system is 
increased past the coexistence pressure, at constant temperature, a phase transition from phase 1
to phase 2 occurs.

3.9.2 Clausius-Clapeyron Equation

The chemical potentials of two phases typically depend on p and T in quite different 
ways. But at a specified temperature, there should be a pressure at which they become 
equal, and where phase coexistence is possible. This is illustrated in Figure 3.6. Changing 
to a different temperature will shift the value of the coexistence pressure. The plot of 
coexistence pressure against temperature is an example of a phase diagram, a description 
of conditions under which different phases of the system are thermodynamically stable. 

We can establish some properties of a boundary between phases on a p-T diagram
using (3.27) and (3.64). We write

$$dG = -S\,dT + V\,dp + \mu\,dN = d(\mu N) = \mu\,dN + N\,d\mu, \tag{3.65}$$

and so

$$d\mu = -s\,dT + v\,dp \tag{3.66}$$

specifies how the chemical potential of a phase depends on the pressure and temperature.
This is known as the Gibbs-Duhem equation, where $s = S/N$ and $v = V/N$ denote the
entropy and volume per particle, respectively, also known as the specific entropy and
volume. The Gibbs-Duhem equation implies that

$$\left(\frac{\partial \mu}{\partial p}\right)_{T} = v, \tag{3.67}$$

and it is the different specific volumes of the two phases that give the two curves in 
Figure 3.6 different (positive) slopes, normally obliging them to intersect somewhere. 
For example, if phase 1 is a gas, then its specific volume is greater than that of a 
liquid phase 2, and the gradients of the chemical potentials reflect this. Figure 3.6 also 
indicates that if the system pressure is lower than the coexistence pressure, then phase 
1 is selected, as this has the lower chemical potential and partitioning all the particles 
into phase 1 minimises the Gibbs free energy. As the pressure crosses the coexistence 
pressure, a phase transition takes place and the system assumes phase 2, again in order 
to minimise the Gibbs free energy. 

We now consider a temperature T and pressure p where the chemical potentials $\mu_1$ and
$\mu_2$ of phase 1 and phase 2 are equal, such that the point (p, T) lies on a boundary in the
phase diagram, as illustrated in Figure 3.7. The phase boundary at a given temperature 
corresponds to the pressure in Figure 3.6 where the two curves of chemical potential 
intersect. 

Now, if we change the temperature to T + dT and the pressure to p + dp, the chemical
potentials of the two phases change according to the Gibbs-Duhem equation:

$$d\mu_{1,2} = -s_{1,2}\,dT + v_{1,2}\,dp, \tag{3.68}$$

where the suffices label each phase. In order that the new point (p + dp, T + dT) also
lies on the phase boundary, the chemical potentials of the two phases should remain
equal. We therefore require that the increments are equal, that is, $d\mu_1 = d\mu_2$, implying
that the phase boundary in the p-T diagram is specified by the following relationship
between dT and dp:

$$(s_1 - s_2)\,dT = (v_1 - v_2)\,dp. \tag{3.69}$$

Figure 3.7 Thermodynamic analysis that leads to the Clausius-Clapeyron equation for a boundary
in a phase diagram. The chemical potentials $\mu_1$ and $\mu_2$ are equal under conditions p and T,
and remain so when they change by $d\mu_1$ and $d\mu_2$, brought about by changes in conditions dT
and dp.


This leads to the so-called Clausius-Clapeyron equation for the coexistence pressure
$p_c(T)$:

$$\frac{dp_c}{dT} = \frac{s_1 - s_2}{v_1 - v_2}. \tag{3.70}$$

The entropy difference per particle in the numerator on the right hand side can be
related to the specific latent heat of the transformation L. This is the heat that must be
provided per particle to drive a quasistatic phase transformation at the temperature T.
Using the Clausius relationship between quasistatic heat transfer and entropy change, we
have $s_1 - s_2 = L/T$. We employ a convention that phase 1 is obtained from phase 2 by
the addition of a positive latent heat: phase 1 has the higher entropy per particle under
conditions of coexistence. Hence

$$\frac{dp_c}{dT} = \frac{L}{T(v_1 - v_2)}. \tag{3.71}$$


For a gas-liquid phase boundary, with gas labelled as phase 1 and liquid as phase 2, we
can make the approximation $v_1 \gg v_2$ to reflect the difference in density (as long as we
consider conditions well away from the critical point), and use $v_1 = kT/p$ such that

$$\frac{dp_e}{dT} = \frac{L_e p_e}{kT^2}, \tag{3.72}$$

where the notation now follows that employed in Sections 3.2.3 and 3.2.5 to represent
the saturated vapour pressure. Assuming that the specific latent heat of evaporation $L_e$
is independent of temperature and pressure, this may be integrated to give

$$p_e(T) = p_e(T_0) \exp\left(-\frac{L_e}{k}\left(\frac{1}{T} - \frac{1}{T_0}\right)\right), \tag{3.73}$$

and so the saturated vapour pressure takes a form reminiscent of (3.32). For a phase
boundary between gas (phase 1) and solid (phase 2), the result (3.73) also applies, but
with a specific latent heat of sublimation $L_s$ specifying a saturated vapour pressure $p_s(T)$
with respect to the solid phase.

Figure 3.8 Phase diagram of a typical material, with phase boundaries specified by various
applications of the Clausius-Clapeyron equation. $L_e$ and $L_s$ are respectively the latent heats of
evaporation and sublimation per particle.
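
As a rough numerical illustration of the integrated form (3.73), the sketch below estimates the saturated vapour pressure of water at 90 °C from its normal boiling point. The input data are approximate textbook values that do not come from this chapter (boiling at 373.15 K under 101325 Pa, and a molar latent heat of about 40.7 kJ/mol treated as constant), and the function name is illustrative.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
N_A = 6.02214076e23  # Avogadro constant, 1/mol

def saturated_vapour_pressure(T, T0, p0, L_e):
    """Equation (3.73): p_e(T) = p_e(T0) exp(-(L_e / k)(1/T - 1/T0))."""
    return p0 * math.exp(-(L_e / K_B) * (1.0 / T - 1.0 / T0))

# Approximate data for water (assumed values, latent heat taken as constant)
L_E_WATER = 40.7e3 / N_A  # latent heat of evaporation per molecule, J
p_90C = saturated_vapour_pressure(363.15, 373.15, 101325.0, L_E_WATER)
print(p_90C)  # of order 70 kPa, noticeably below atmospheric pressure
```

The strong sensitivity to temperature comes from the exponential: a ten kelvin drop removes roughly a third of the vapour pressure.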

For the phase boundary between solid and liquid, (3.71) takes the form

$$\frac{dp_f}{dT} = \frac{L_f}{T\Delta v}, \tag{3.74}$$

where $L_f$ is the specific latent heat of fusion, and $\Delta v$ is the difference in volume per
particle associated with melting solid (phase 2) to liquid (phase 1), which is roughly temperature
and pressure independent. The phase boundary is then $p_f(T) = (L_f/\Delta v) \ln T +
\mathrm{constant}$, and as $\Delta v$ is usually rather small, it is much steeper than the gas-to-condensed
phase boundaries. For ice, which contracts on melting, $\Delta v < 0$ and the slope of the phase
boundary is negative, but for most substances it is positive. A phase diagram of a typical
material therefore takes the form illustrated in Figure 3.8. The point at the intersection 
of the three boundaries, at which solid, liquid and gas phases are in coexistence, is the 
triple point. 
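
The steepness and sign of the melting line can be illustrated with approximate data for water, which are assumed values not taken from this chapter: a latent heat of fusion of about 334 kJ/kg and densities of about 1000 kg/m³ for the liquid and 917 kg/m³ for ice. The sketch evaluates (3.74) per kilogram rather than per particle, which leaves the ratio $L_f/\Delta v$ unchanged. The function name is illustrative.

```python
def melting_curve_slope(T, latent_heat, delta_v):
    """Equation (3.74): dp_f/dT = L_f / (T * delta_v)."""
    return latent_heat / (T * delta_v)

# Approximate data for water per kilogram (assumed values)
L_F = 3.34e5                          # latent heat of fusion, J/kg
DELTA_V = 1.0 / 1000.0 - 1.0 / 917.0  # liquid minus ice specific volume, m^3/kg
slope = melting_curve_slope(273.15, L_F, DELTA_V)
print(slope)  # negative, and of order ten megapascals per kelvin in magnitude
```

A pressure increase of more than a hundred atmospheres is therefore needed to lower the melting point of ice by one kelvin.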


3.9.3 The Maxwell Equal Areas Construction 

Some model equations of state, such as the van der Waals equation encountered in Section 
3.4, give rise to an unphysical ‘loop’ (actually a wiggle) on a p-V plot, as shown in
Figure 3.9. We have used dimensionless coordinates $pb/kT$ to represent pressure and
$V/Nb$ to represent volume, and chosen parameters such that $a = 4bkT$ to illustrate the
behaviour. The ideal gas law is shown using the same coordinates to indicate that the 
pressure is reduced to reflect the interparticle attraction. Such a wiggle is seen in the 
equation of state when the temperature lies below a certain threshold and it is clearly 
nonsense. What kind of gas increases its pressure when the volume is increased, which 
is suggested for specific volumes V /N between about 2b and 5b? This is an artefact of 
the model, arising from the assumption that the fluid is homogeneous. 

The wiggle actually tells us that for the given temperature, the system separates into
phases at two different densities, indicated by two specific volumes at the same pressure.
These are represented by the outermost of the three points of intersection between the
equation of state and a line of constant pressure: the middle one of the three turns out
to be mechanically unstable. But how can we determine the pressure $p_e$ at which there
is equilibrium between the two phases representing liquid and gas?

A neat solution to this problem is to employ $(\partial\mu/\partial p)_T = v$, the Gibbs-Duhem
equation (3.68) for the dependence of chemical potential on pressure at constant temperature.
This can be integrated from $v_i$ to $v_f$ to give

$$\int_i^f d\mu = \int_i^f v\,dp \;\Rightarrow\; \mu_f - \mu_i = \int_i^f d(pv) - \int_i^f p\,dv. \tag{3.75}$$

We consider states with specific volumes $v_f$ and $v_i$ that coexist, and so by definition
they have equal chemical potentials such that the left hand side of (3.75) is zero. Also
the pressure $p_e$ of the coexisting phases must be the same, so $\int_i^f d(pv) = p_f v_f - p_i v_i =
p_e(v_f - v_i) = p_e \int_i^f dv$. Using (3.75) we can therefore conclude that $p_e$, $v_f$ and $v_i$ satisfy

$$\int_{v_i}^{v_f} \left(p(v) - p_e\right) dv = 0. \tag{3.76}$$


On a p-V plot this condition is readily interpreted geometrically, as shown in Figure
3.9 for the van der Waals equation of state. The coexistence pressure and associated coexisting
specific volumes of the phases are determined by a condition of equality between
the two areas defined by the three intersections between the p-V curve provided by the
equation of state and the horizontal line at $p = p_e$. This is called the Maxwell equal areas
construction. The equilibrium equation of state then properly consists of the horizontal
line at the coexistence pressure, together with the van der Waals equation for $v > v_f$ and
$v < v_i$. A system with a volume per particle somewhere between $v_i$ and $v_f$ will consist
of an appropriate mixture of the two coexisting phases.

Figure 3.9 Isotherm of the van der Waals equation of state with parameter choice $a = 4bkT$,
together with the Maxwell equal areas construction that determines the specific volumes $v = V/N$
of the coexisting liquid and gas phases at a given temperature. The ideal gas law is shown as a
dashed line for comparison.
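
The equal areas condition (3.76) can be solved numerically for the isotherm with $a = 4bkT$. The sketch below works in the dimensionless coordinates $pb/kT$ and $x = V/Nb$, in which the van der Waals pressure becomes $1/(x - 1) - 4/x^2$ with turning points at $x = 2$ and $x = 3 + \sqrt{5}$. It uses nested bisection, one possible approach among several, first to locate the liquid and gas intersections for a trial pressure and then to adjust that pressure until the area integral vanishes. All function names, tolerances and search brackets are illustrative choices.

```python
import math

A_RED = 4.0  # a / (b k T), the parameter choice of the isotherm discussed above

def pressure(x):
    """Reduced van der Waals pressure p b / kT as a function of x = V / (N b)."""
    return 1.0 / (x - 1.0) - A_RED / (x * x)

def antiderivative(x):
    """Indefinite integral of pressure(x) with respect to x."""
    return math.log(x - 1.0) + A_RED / x

def bisect(f, lo, hi, tol=1e-13):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Turning points of the isotherm for A_RED = 4
X_MIN, X_MAX = 2.0, 3.0 + math.sqrt(5.0)

def area_mismatch(p_e):
    """Integral of (p(x) - p_e) between the liquid and gas intersections."""
    x_liq = bisect(lambda x: pressure(x) - p_e, 1.0 + 1e-9, X_MIN)
    x_gas = bisect(lambda x: pressure(x) - p_e, X_MAX, 1e6)
    area = antiderivative(x_gas) - antiderivative(x_liq) - p_e * (x_gas - x_liq)
    return area, x_liq, x_gas

def maxwell_construction():
    """Coexistence pressure and specific volumes satisfying the equal areas rule."""
    p_top = pressure(X_MAX)  # highest pressure at which three intersections exist
    p_e = bisect(lambda p: area_mismatch(p)[0], 1e-4, p_top - 1e-9)
    _, x_liq, x_gas = area_mismatch(p_e)
    return p_e, x_liq, x_gas

p_e, x_liq, x_gas = maxwell_construction()
print(p_e, x_liq, x_gas)
```

The returned volumes lie outside the interval between the two turning points, as they must, since states between the turning points have pressure rising with volume and cannot be stable.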

3.9.4 Metastability and Nucleation 

We have just seen that some model equations of state are inappropriate for the range 
of specific volumes (or equivalently system densities) between those of the coexisting 
phases. But in fact the Maxwell construction and phase diagrams such as Figure 3.8 
are also misleading. All phase diagrams tell only part of the story because they fail to 
take into account the physical difficulty a system might have in carrying out a transition 
between phases as demanded by the second law. A familiar example is the case of 
diamond: a phase of carbon that is stable at high temperatures and pressures but is also 
found at room temperature and pressure, where graphite is supposed to be the stable 
phase. 

The diamond phase under such conditions is called metastable. It fails to comply with 
the second law because it cannot easily achieve the necessary atomic rearrangements to 
move towards a minimum of the free energy. One way to view this is that the necessary 
initial stages of the rearrangement correspond to structures with a higher free energy, 
and can only be achieved through a thermal fluctuation, apparently a temporary violation 
of the second law. The system is locked into an inappropriate phase by what is called a 
nucleation barrier. The diamond phase is metastable at room temperature: its transition 
rate into graphite is negligibly small. 

Similarly, it is possible to compress a gas isothermally beyond the coexistence pressure 
suggested in Figure 3.9. It approximately follows the curve beyond this point and 
becomes supersaturated. Equivalently, the system can pass across the phase boundary
from the gas into the liquid region of Figure 3.8, but remain gaseous. Again, the
difficulty is that although the liquid phase has a lower chemical potential and is thermodynamically
more stable, the tiny droplets that need to be formed in order that the new
phase can emerge are less stable than the mother phase of supersaturated gas: they have 
a higher chemical potential. It is therefore improbable that the system should rearrange 
itself to form small droplets: there is a nucleation barrier. In certain circumstances, gases 
can be compressed to multiples of the saturated vapour pressure, without condensing. 

The freezing of a liquid can be impeded by a nucleation barrier. Water can be supercooled
to −40 °C in spite of the insistence in phase diagrams that it should freeze at
zero Celsius. In order to achieve this supercooling, the water must be free of impurities
or solid surfaces that can nucleate ice. All this is evidence that thermodynamic processes
are sometimes controlled by events that take place between the initial and final
equilibrium states, and these are not always macroscopic in scale. 


3.10 Work Processes without Volume Change 


Work does not always just involve the mechanical compression of a system: this is the 
standard example but other cases exist. For solids, work can be performed by distorting
the shape without changing the volume. For example, consider the extension dL of a wire
under an applied force $\mathcal{F}$ accompanied by a lateral contraction of the wire to conserve
volume. If the process is quasistatic, $\mathcal{F}$ describes the equilibrium tension of the system, a
state variable analogous to pressure. We write $dW = \mathcal{F}\,dL$ and add it to the term $-p\,dV$
that describes a volume change. Similarly, the distortion can produce a change $d\mathcal{A}$ in
the surface area of the system without a change in the volume. Work of this kind is
performed if we squeeze a spherical droplet into an oblate shape. The corresponding
quasistatic work term is written as $dW = \Gamma\,d\mathcal{A}$ where the coefficient $\Gamma$, known as the
surface tension, is again a state variable that describes a system in equilibrium.

The first law $dE = dQ + dW$ and the Clausius expression for an entropy increment
then yield an extended fundamental relation:

$$dE = T\,dS - p\,dV + \mu\,dN + \mathcal{F}\,dL + \Gamma\,d\mathcal{A}. \tag{3.77}$$

The work terms take the form $X\,dx$, where X and x are intensive and extensive variables,
respectively. State variables such as E and S, for example, then acquire a dependence on
the new variables L and $\mathcal{A}$. It can become rather complicated: this is why volume work
is almost always the standard example considered!


3.11 Consequences of the Third Law 

Like the other classical laws of thermodynamics, the third law is a statement of empirical 
observation. We have seen in this chapter how the entropy of a system may be reconstructed
through measurements of heat capacities, equations of state, or derived from
chemical potentials inferred from phase coexistence. Nearly all this data is compatible
with the idea that as $T \to 0$, the entropy of a system goes to a constant, and furthermore,
that it is the same constant for all systems. A very few systems appear to have a capacity
for nonzero entropy even as the temperature goes to zero, although it is not clear how
this behaviour might be exploited. For compatibility with the statistical interpretation of
entropy, to be discussed later, the standard reference value of the entropy of systems at
$T = 0$ K is taken to be zero.

The principal implication of the third law is that absolute zero temperature is unattainable.
This can be illustrated in Figure 3.10, where two possible scenarios involving the
entropy function S(T,V) are shown. On the left, the system entropy at T = 0 depends
on the volume V, whereas on the right it does not: the third law is satisfied in the second
case. We can imagine sequences of isothermal and adiabatic processes involving changes
in volume that drive the system along the zigzag path between functions $S(T, V_1)$ and
$S(T, V_2)$ starting from a finite temperature. If the system did not satisfy the third law,
then after a few compressions and expansions shown on the left, the system entropy 
would be reduced to its value at some V between the two extremes, and at a temperature 
of zero. For the system that satisfied the third law, in contrast, this approach to zero 
would require an infinite number of steps. Of course there are many technical difficulties
in cooling a system, such as the elimination of leakage of heat from the environment, but
the third law states that we cannot entirely remove all the thermal energy in a system,
even in principle.

Figure 3.10 The diagram on the left shows how a temperature of absolute zero can be reached
if a system has an entropy function that depends on volume at T = 0. The two curves represent
the system entropy as a function of temperature at two volumes with $V_1 > V_2$. The zigzag path is
followed by a sequence of isothermal compressions (moves downward), followed by quasistatic
adiabatic expansions (moves to the left). In contrast, if $S(0, V_1) = S(0, V_2)$ as shown on the right,
then an infinite sequence would be needed to reach absolute zero, making it unattainable.

But if we accept the third law as a necessary boundary condition for models of entropy, 
then we must conclude that the entropy expression for the ideal gas (2.22), on which 
we have built much of our intuition, cannot be correct. It quite clearly does not satisfy 
the third law, tending towards $-\infty$ rather than zero as $T \to 0$! However, the model
is classical: we need not despair. We might have expected (2.22) to fail in the low 
temperature limit since we know that classical physics should then be replaced by a 
quantum mechanical treatment. In later chapters, we shall employ a more appropriate 
quantum model of the ideal gas, within a framework of statistical mechanics, and find 
that low temperature behaviour that satisfies the third law emerges, along with some 
unexpected richness in phenomena. 


3.12 Limitations of Classical Thermodynamics 

Rather than focusing on its limitations, perhaps we should celebrate the successes of 
classical thermodynamics, only some of which have been addressed in this chapter. The 
very power of the approach is that it does not particularly rely on specific assumptions 
about the interactions between the particles in a system. The nature of these interactions is 
inferred from measurements of macroscopic properties such as an equation of state. The 
principal value of the discipline is that experimental measurements can be related to one 
another in ways that are not at first apparent, for example the connection between heat 
capacities and mechanical properties in (3.48). Indeed this represents the very purpose of 
theoretical studies, whereby making a few assumptions about the way the world works
can lead to the establishment of connections between phenomena and the revelation of
some unifying principles underlying the complexity we see around us. 

But we now seek to go beyond classical thermodynamics and its applications. At the 
heart of the subject is the key property of entropy, through which many of the connections 
arise, it seems. But just what is entropy? Classical thermodynamics cannot provide a clear 
and satisfactory answer. Statistical thermodynamics was developed precisely in order to 
provide such an understanding, and in the next chapter we study the core ideas of this 
approach. 


Exercises 


3.1 Taking the enthalpy density of an incompressible liquid to have the form h = AT +
Bp, where A and B are positive constants, show that the entropy generated per unit
volume of fluid flowing through a plug with a drop in pressure from $p_i$ to $p_0$ is given
by $v(A/B) \ln[1 + B(p_i - p_0)/(AT_i)]$, where $T_i$ is the initial temperature.

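3.2 The isothermal compressibility $\kappa_T$, the thermal expansion coefficient at constant
pressure $\alpha$ and the heat capacity at constant volume $C_V$ of a system are defined as

$$\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_T, \qquad \alpha = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p \qquad \text{and} \qquad C_V = T\left(\frac{\partial S}{\partial T}\right)_V,$$

where constancy of N is understood. Show that the gradient of an adiabat of the
system on the T–V plane is given by

$$\left(\frac{\partial T}{\partial V}\right)_S = -\frac{T\alpha}{C_V \kappa_T}.$$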

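You may use the identities

$$\left(\frac{\partial x}{\partial y}\right)_z \left(\frac{\partial y}{\partial z}\right)_x \left(\frac{\partial z}{\partial x}\right)_y = -1$$

and

[second identity missing in the source]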

Hence determine the equation of an adiabat in the T–V plane for (a) an ideal gas
and (b) a substance with constant intensive thermal properties $\alpha$ and $\kappa_T$, and a
temperature independent but extensive $C_V$.

3.3 Derive a Maxwell relation involving the quantity $(\partial S/\partial p)_T$ and check that it is
satisfied by an ideal gas.

3.4 Express the Gibbs free energy G and enthalpy H of an ideal gas in terms of p, T
and N. Show that in general $(\partial (G/T)/\partial T)_{p,N} = -H/T^2$ and demonstrate that this
result is satisfied by an ideal gas.

3.5 A system can take two phases, liquid or solid. Above the melting temperature, 
which phase has the lower chemical potential, and why? At which temperature do 
both phases have the same chemical potential? 

3.6 Starting from the fundamental relation of thermodynamics, show that

$$dS = \left[\frac{1}{T}\left(\frac{\partial E}{\partial V}\right)_T + \frac{p}{T}\right] dV + \frac{1}{T}\left(\frac{\partial E}{\partial T}\right)_V dT,$$

and derive expressions for $(\partial S/\partial V)_T$ and $(\partial S/\partial T)_V$. Hence show that

$$\left(\frac{\partial E}{\partial V}\right)_T = T^2 \left(\frac{\partial (p/T)}{\partial T}\right)_V.$$

Use this relationship to demonstrate that the energy of an ideal classical gas does
not change when it is expanded or compressed at constant temperature. Show that
the energy of a van der Waals gas does change on isothermal expansion. Does it
increase or decrease? Argue physically in support of the direction of change that
you deduce. Calculate $(\partial S/\partial V)_T$ for the van der Waals gas.

3.7 Show that the difference in specific heat capacities (the heat capacity per unit mass)
can be written as

$$c_p - c_v = \frac{T\alpha^2}{\rho \kappa_T},$$

where $\rho$ is the mass density. Search online for data on the physical properties of
liquid water and hence estimate the ratio $(c_p - c_v)/c_v$.

3.8 Write down the Helmholtz free energy of a neutral plasma of $N_H$ atoms of hydrogen,
$N_e$ electrons and $N_p = N_e$ free protons at a fixed system volume and temperature,
modelling all three components as monatomic ideal classical gases, but taking into
account the fact that the energy of the hydrogen atom lies a dissociation energy
$\epsilon > 0$ below that of its separated components. Note that the thermal de Broglie
wavelength of an electron $\lambda_{\mathrm{th}}^{e}$ differs from that of a proton, but that the thermal de
Broglie wavelengths of a proton and a hydrogen atom are approximately the same.
Regarding $N_H$ as an unconstrained internal variable, and imposing the constraint
$N_H + N_p$ = constant, show that the densities of the three components at a given
temperature are related by the Saha relation:

$$\frac{n_e n_p}{n_H} = \frac{1}{(\lambda_{\mathrm{th}}^{e})^{3}} \exp\left(-\frac{\epsilon}{kT}\right),$$



corresponding to a balance between the sum of chemical potentials on either side
of the reaction H ⇌ e + p. The fraction of ionised hydrogen in the photosphere
(the visible outer layer) of the sun, where the temperature is about 6000 K and the
particle density approximately $10^{23}$ m$^{-3}$, is about $10^{-4}$. Estimate the ionised fraction
if the plasma were heated to $2 \times 10^4$ K at the same density, using $\epsilon = 13.6$ eV.





4 

Core Ideas of Statistical 
Thermodynamics 


Statistical thermodynamics is an attempt to relate the phenomenological, macroscopic 
laws of classical thermodynamics to an underlying quantitative picture of molecular 
behaviour. Statistical mechanics is a broader term that includes dynamical systems not 
usually treated in thermodynamics, such as star clusters or even crowds of people. But 
since thermodynamics is applied to macroscopic quantities of gases, liquids and solids, 
containing of the order of $10^{23}$ particles, this task appears rather daunting. How can we
establish the behaviour of this number of particles? 

However, we clearly do not need to go so far because thermodynamic systems seem 
to be well enough characterised by just a handful of macroscopic quantities (energy, 
pressure, etc.) and the relationships between them. If we were to take the trouble of 
determining the behaviour of every molecule, then our efforts would be magnificent but 
pointless 1 : the motion of every molecule surely cannot affect the equation of state. The 
basic assumption of statistical thermodynamics is that the detail of the molecular motion 
is irrelevant. This leads to the argument that a study of the likely or the average behaviour 
of the particles is good enough. The statistical properties at the microscale are therefore 
our focus of attention, and this requires us to review our understanding of probability. 


4.1 The Nature of Probability 

It is often stated that if we roll a die, there is a probability of 1/6 that a six will be thrown. 
Around this apparently simple statement has raged a couple of centuries of philosophical 
debate. 

When we know there to be a variety of possible outcomes of some event, but we
cannot determine which will happen, probability is something we use to weight the
outcomes in order to define our expectation. The probabilities we assign to the outcomes
could be nothing more than a guess, in which case we would be creating an expectation
that might not correlate very well with the actual event. Ideally, we should use some data,
or information, to construct a set of probabilities that allows us to make good judgements
about the future behaviour. This line of thought is called information theory, wherein
probabilities are used as a basis for logical reasoning (see Jaynes in Further Reading).
On the other hand, it does not sound like a unique way of proceeding: how can we
arrive at the best judgement?

1 Rather like Samuel Johnson’s view on cucumbers: ‘It has been a common saying of physicians in England, that a cucumber
should be well sliced, and dressed with pepper and vinegar, and then thrown out, as good for nothing.’

Another point of view is that we should determine the actual frequencies with which 
each event should turn up in a number of trials, and use these to weight the outcomes. 
Then it would seem that we could derive an expectation that is tied to real documented 
behaviour, albeit from the past. However, the problem is that this is a unique approach 
only if we run an infinite number of trials. Otherwise, the frequencies would only be 
estimates. If we do not have time for that many trials, there are sophisticated ways 
of estimating the errors but we essentially revert to making a judgement about the 
probabilities, on the basis of the limited set of data. 

The viewpoint that probabilities represent a distillation of our best judgement has 
some advantages. If we examine a die and reckon that it is symmetrical or true, we can 
make a judgement that each face is as likely as any other to come up in a throw. We 
have a basis for saying that the probability of each outcome is 1/6. This might be wrong: 
the die might be unfair. If so, then a few trials will provide us with information that 
will allow us to revise our probabilities in some way. If we were to cast the die a large 
number of times, then a true die would generate frequencies of the various outcomes 
that converge towards 1/6, in which case our initial guess was a good one. Or perhaps it 
is rigged and always throws a 1, indicating that our model of the system is flawed: our 
judgement was in error, our expectations are wrong and need to be revised. 
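
The way trial data can confirm or overturn a judgement can be sketched numerically. The following is only an illustration under stated assumptions: the function name `roll_frequencies`, the fixed seed and the weights used to model a rigged die are all hypothetical choices made for this example.

```python
import random
from collections import Counter

def roll_frequencies(n_rolls, weights=None, seed=0):
    """Return observed frequencies of faces 1..6 after n_rolls throws.

    weights=None models a true die; otherwise weights gives the relative
    likelihood of each face (an assumed model of an unfair die).
    """
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    faces = [1, 2, 3, 4, 5, 6]
    throws = rng.choices(faces, weights=weights, k=n_rolls)
    counts = Counter(throws)
    return {face: counts[face] / n_rolls for face in faces}

# A true die: the largest deviation from 1/6 shrinks as the trials accumulate.
for n in (60, 6000, 600000):
    freqs = roll_frequencies(n)
    worst = max(abs(f - 1 / 6) for f in freqs.values())
    print(n, round(worst, 4))

# A rigged die that always shows a 1: the data contradict the 1/6 judgement.
rigged = roll_frequencies(600, weights=[1, 0, 0, 0, 0, 0])
print(rigged[1])
```

With a finite number of throws the frequencies are only estimates, which is why the deviation is reported rather than an exact match to 1/6.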

Whether the numerical values are generated by some sort of judgement, or from trial 
data, the probabilities that we actually use to weight the events have to satisfy the same 
rules of arithmetic, so that the distinction need not bother us too much for the present. 
The basic point, though, is that an average over a set of probabilities might just be a 
best guess or a hypothesis based on a model, and that the accuracy of the model should 
be tested. 

We can express an intuitive understanding of probability in the form of the following 
statements: 

1. For each possible outcome i, there is a positive probability P(i) denoting the statistical 
weighting (or limiting frequency if you prefer) for it to arise in a trial, such as the 
rolling of a die. 

2. The sum of the P(i) for all possible outcomes of a trial is unity, that is, $\sum_i P(i) = 1$.

3. Intuitively, outcomes related through some symmetry in the system should have the 
same probability; hence to begin with, we guess that the probabilities for each outcome 
of the die roll are the same and equal to 1/6. 

4. The probability that either outcome i or outcome j should occur is given by the sum 
P(i) + P(j), as long as the two outcomes are mutually exclusive. So the probability
of throwing a five or a six is 1/3. 

5. There is a probability $P_2(i,j)$ of joint outcomes, for example outcome i as well as
outcome j. If the events are uncorrelated, meaning that the outcome of one trial is
unaffected by the outcome of another, then the joint probability is the product of
individual probabilities: $P_2(i,j) = P(i)P(j)$. The trials are said to be independent.
If the outcomes are correlated, this factorisation does not apply. Thus the probability
of throwing two sixes with two normal dice is (1/6) × (1/6) = 1/36. If, on the other
hand, the dice are connected in some spooky way, such that they never produce the
same number, then the probability of two sixes is zero. The two outcomes in this
case are correlated. This idea extends to longer sequences, $P_3(i,j,k)$, and so on.
The joint probability can be expressed as a product of the probability of the first
outcome times a conditional probability of the second outcome given the first, that
is, $P_2(i,j) = P(i|j)P(j)$. The vertical line should be read as ‘given’. The idea of
independence is that $P(i|j) = P(i)$.
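
The five statements above can be checked with exact arithmetic for a pair of dice. This is a minimal sketch rather than anything prescribed by the text: in particular, the model of the 'spooky' dice (all 30 unequal pairs taken to be equally likely) and the helper name `conditional` are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
P = {i: Fraction(1, 6) for i in faces}           # statement 3: symmetry guess

assert sum(P.values()) == 1                       # statement 2: normalisation
p_five_or_six = P[5] + P[6]                       # statement 4: exclusive outcomes add

# Statement 5, independent dice: the joint probability factorises.
P2_independent = {(i, j): P[i] * P[j] for i, j in product(faces, faces)}

# Statement 5, correlated dice (assumed model): the two dice never show the
# same number, and the 30 remaining pairs are taken to be equally likely.
P2_correlated = {
    (i, j): (Fraction(0) if i == j else Fraction(1, 30))
    for i, j in product(faces, faces)
}

def conditional(P2, i, j):
    """P(i|j): probability of i on the first die given j on the second."""
    P_j = sum(P2[(k, j)] for k in faces)          # marginal probability of j
    return P2[(i, j)] / P_j

print(p_five_or_six)                         # 1/3
print(P2_independent[(6, 6)])                # 1/36
print(P2_correlated[(6, 6)])                 # 0
print(conditional(P2_independent, 6, 6))     # 1/6, equal to P(6): independent
print(conditional(P2_correlated, 5, 6))      # 1/5, not equal to P(5): correlated
```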

The P(i) form a histogram that expresses the relative likelihoods of the outcomes of a 
trial. The histogram can be generalised to a continuous function if there is a continuum 
of possible outcomes. A cubic die has six faces and therefore six possible outcomes 
for a throw. A rolled pound coin, on the other hand, has a continuum of points on the 
circumference that might be uppermost when it stops (assuming it does not fall over!). 
We define $p(\theta)\,d\theta$ as the probability that the uppermost point on the circumference lies
between angles $\theta$ and $\theta + d\theta$ measured from a vertical with respect to the Queen’s head
for example, where $p(\theta)$ is a probability density function or pdf. If the coin were perfectly
circular, then we would guess $p(\theta) = 1/2\pi$, as normalisation in this case would require
that $\int_0^{2\pi} p(\theta)\,d\theta = 1$. This is analogous to our guess, based on the symmetry of the cube,
that a die should roll a six with probability 1/6.

Arguably, the most important statistical properties of a set of probabilities, or probability
distribution, are the mean and the standard deviation. If a variable n is characterised
by a discrete probability distribution P(n) over its possible numerical values, then the
mean is written using angled brackets as

$$\langle n \rangle = \sum_n n P(n), \qquad (4.1)$$

where the sum is over the set of possible outcomes. The standard deviation $\sigma$ is defined
as the square root of the variance given by $\sigma^2 = \langle (n - \langle n \rangle)^2 \rangle$; the mean square deviation
from the mean. Thus, the mean throw of a true die is 3.5, with a standard deviation of 
about 1.7. The latter gives an indication of how much deviation from the mean we might 
expect to see in a typical measurement. This should not, however, mask the fact that the 
probability of each outcome in this case is the same! 
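
The figures quoted for a true die follow directly from the definitions of the mean and variance. A short sketch, in which the helper name `mean_and_std` is simply an illustrative choice:

```python
from math import sqrt

def mean_and_std(P):
    """Mean and standard deviation of a discrete distribution {value: probability}."""
    mean = sum(n * p for n, p in P.items())
    variance = sum((n - mean) ** 2 * p for n, p in P.items())
    return mean, sqrt(variance)

true_die = {n: 1 / 6 for n in range(1, 7)}
mean, sigma = mean_and_std(true_die)
print(round(mean, 3), round(sigma, 3))   # 3.5 1.708

# The variance can equally be written as the mean square minus the squared mean.
mean_square = sum(n ** 2 * p for n, p in true_die.items())
print(round(sqrt(mean_square - mean ** 2), 3))   # 1.708
```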

For a continuous variable x characterised by a pdf p(x), the mean and variance
are defined by $\langle x \rangle = \int x\,p(x)\,dx$ and $\sigma^2 = \int (x - \langle x \rangle)^2 p(x)\,dx$ with the integrals
performed over the entire range of possible values of x. Note that
$\sigma^2 = \int (x^2 - 2x\langle x \rangle + \langle x \rangle^2)\,p(x)\,dx = \langle x^2 \rangle - \langle x \rangle^2$.
We can define the mean of a function f(x) of the variable x as

$$\langle f(x) \rangle = \int f(x)\,p(x)\,dx, \qquad (4.2)$$

where the value of the function for an outcome x is weighted by the probability
of producing that outcome. We have already considered cases $f(x) = x$ and
$f(x) = (x - \langle x \rangle)^2$.


The die and the coin are characterised by uniform probability distributions over orientation,
but in many cases we see distributions with a peak. The most common distribution
of this kind is called a Gaussian or normal distribution. It is entirely characterised by
the mean and standard deviation. It is the ‘bell shaped curve’ specified by

$$p_G(x) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left(-\frac{(x - \langle x \rangle)^2}{2\sigma^2}\right), \qquad (4.3)$$

which is normalised such that $\int_{-\infty}^{\infty} p_G(x)\,dx = 1$. For example, the distribution of marks
in an exam is expected to be approximately Gaussian (for a large class!) so a script
selected at random from a stack of scripts should receive a mark in the range x to x + dx
with probability $p_G(x)\,dx$, and the statistics are summarised by the two parameters $\langle x \rangle$
and $\sigma$ alone.
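
The claim that a Gaussian is normalised and fully described by its mean and standard deviation can be checked numerically. The sketch below assumes the standard form of the normal pdf, exp(-(x - mean)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2); the exam statistics (mean 60, standard deviation 12) and the helper names are invented for illustration.

```python
from math import exp, pi, sqrt

def p_gauss(x, mean, sigma):
    """Normal pdf with the given mean and standard deviation."""
    return exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / sqrt(2 * pi * sigma ** 2)

def integrate(f, a, b, n=20000):
    """Trapezoidal estimate of the integral of f from a to b."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n))
    return total * h

mean, sigma = 60.0, 12.0                        # assumed illustrative values
lo, hi = mean - 10 * sigma, mean + 10 * sigma   # tails beyond this are negligible

norm = integrate(lambda x: p_gauss(x, mean, sigma), lo, hi)
avg = integrate(lambda x: x * p_gauss(x, mean, sigma), lo, hi)
var = integrate(lambda x: (x - avg) ** 2 * p_gauss(x, mean, sigma), lo, hi)
print(round(norm, 6), round(avg, 6), round(sqrt(var), 6))
```

The three printed numbers should reproduce unity and the two input parameters to within the accuracy of the quadrature.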

In statistical thermodynamics, we aim to establish a probability distribution or a pdf 
characterising aspects of a system’s behaviour. These aspects could be microscopic, such 
as the values of particle velocities and positions, or macroscopic, such as spatial profiles 
of particle densities. Such distributions tell us what we need to know about the system’s 
statistical behaviour, such as the mean energy per particle, the mean particle density at 
a given location or the standard deviation of these quantities. The crux of the matter is 
to identify such distributions. It seems that this may be done successfully for a system 
in equilibrium, where the properties are time independent and there are no mean flows 
of heat or particles into or out of the system. The treatment of systems where there are 
such flows, such as a system in the process of cooling down towards equilibrium, on the 
other hand, is much more complicated, and not yet fully understood. We shall return to 
this matter in Chapter 15. 

4.2 Dynamics of Complex Systems 

4.2.1 The Principle of Equal a Priori Probabilities 

The probability that a system variable should take a value in a certain range is clearly
something that depends on the dynamics of the system. For example, if we wish to 
establish the pdf of the momentum of a particular molecule in a sample of gas, we 
ought to start with a consideration of the molecular dynamics. It turns out, though, that 
only rudimentary dynamical understanding is needed, at least for a system in thermal 
equilibrium. 

It helps to steer clear of complicated dynamical systems at first and to focus on small 
discrete systems with simple dynamical rules. Let us start with a dynamical system 
consisting of several compartments, between which some conserved material is shared. 
The material comes in the form of indivisible units such that it is distributed in integer 
quantities between the compartments. The dynamical rule is that every timestep, a specified
number of units of material, depending on the prevailing situation, moves from one
compartment to another. This is not meant to be a realistic situation, though a version
of it approximates to the behaviour of a set of weakly interacting quantised harmonic
oscillators.

We consider first a system with two compartments. In fact, let us regard the compartments
as two people who share a certain number of pound coins. Every couple
of seconds one of them gives the other £x, according to a definite cashflow rulebook. 
For example, if A has £5 and B has £10, then at the next exchange, £2 is passed from 
A to B. From the resulting situation of £3 : £12, the cashflow rulebook says that the next 
transfer is £3 in the opposite direction. The rules could be of any kind, but let us suppose 
that they are such that the sequence of situations that is generated includes all possible 
arrangements of the £15 between the two people. The dynamics would start from an 
arbitrary point in the sequence, visit all the possibilities, and then it would repeat. 

In the case just described, it would seem fair to say that if we observed the situation 
at an arbitrary later time (and neglected to spot regularities such as the repeat time), we 
should expect to find the cash distributed in any one of the possible arrangements with 
equal probability. The two participants generate all arrangements in a regular sequence 
and at an arbitrary time any one of them might be observed, none with any greater 
likelihood than another. 
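
A deterministic rulebook of this kind is easy to mimic. The sketch below uses a hypothetical rule, not the one quoted in the text: A's holding advances by a fixed step modulo Q + 1, with the step chosen coprime to Q + 1 so that every arrangement of the £15 is visited once per cycle. The names `STEP`, `next_state` and `transfer` are assumptions made for the example.

```python
from collections import Counter
from math import gcd

Q = 15                      # total number of pound coins
STEP = 7                    # hypothetical rulebook parameter, coprime to Q + 1
assert gcd(STEP, Q + 1) == 1

def next_state(q_A):
    """Hypothetical cashflow rulebook: map A's holding to its next value.

    Because STEP is coprime to Q + 1, repeated application visits every
    holding 0..Q exactly once before repeating.
    """
    return (q_A + STEP) % (Q + 1)

def transfer(q_A):
    """Signed amount passed from B to A at the next exchange."""
    return next_state(q_A) - q_A

q_A = 5                     # start from the arrangement £5 : £10
visits = Counter()
for _ in range(10 * (Q + 1)):        # ten complete cycles of the rulebook
    visits[(q_A, Q - q_A)] += 1
    q_A = next_state(q_A)

print(len(visits))                   # 16 distinct arrangements
print(set(visits.values()))          # {10}: each visited equally often
```

An observer sampling this sequence at an arbitrary time, without tracking the repeat pattern, would find each arrangement with probability 1/16.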

We make this more precise now by conceiving of a set of microstates of a dynamical 
system. A microstate in this example might be characterised by the number $q_A$ of units
residing in compartment A and $q_B$ in compartment B, with $q_A + q_B = Q$, the fixed
total number of units. The microstates are the states of a system we might observe
as a snapshot during the evolution and in this case are labelled $(q_A, q_B)$: explicitly they
form the set $(Q,0), (Q-1,1), \ldots, (1,Q-1), (0,Q)$ illustrated in Figure 4.1. There are
$(Q + 1)$ such elementary arrangements of the material between the two compartments,
and a time series of exchanges would correspond to a path linking them in a sequence. 

If the microstates were represented by keys on a piano, a path would consist of a 
sequence of individual notes, using every one on the keyboard, and then repeating. Each 
note would represent a snapshot of the system. It might not be music, but it would be 
dynamic! The probability that a note should be heard at a randomly chosen time is the 
same for all 88 of them. Explicitly, the probability that a system might be found in a
particular microstate would be equal to $1/\Omega$, where $\Omega$ is the number of microstates, and
equal to Q + 1 in this example.

The set of all microstates of a system, set out as in Figure 4.1, is called the system’s 
phase space. This is an unfortunate use of the same word, phase, that is used to specify 
different states of matter such as solids, liquids and gases. Be aware that a phase space 
is not the same as a phase diagram!

Now we distort the logic to apply similar ideas to a dynamical system in the real 
world. Imagine that A and B exchange cash according to rules that we are not sure 
about. We want to predict the future, but on the basis of an incomplete knowledge about 
the rules of the game. How do we proceed? 

All we can do is hypothesise about the effect of those rules and see how it works out, 
and our first guess is that there is no reason to believe the dynamics should favour any 
one configuration over another, as we do not have enough information to make such 
a judgement. This is the least biassed thing we can do! It means we should take the 
probability of finding the system in any individual configuration to be the same. Note 
that this is a probability that is a reflection of our judgement, and is not a frequency over
real trials: we have not done any. Also, note that this is a time-independent assignment
of probabilities, and so what we are considering here is a hypothesis about the values
of the equilibrium microstate probabilities. How they behave for a system that is out of
equilibrium, when the probabilities might depend on time, is another matter entirely.

Figure 4.1 Set of all possible divisions of Q items between two compartments A and B, labelled
by $(q_A, q_B)$ with $q_A + q_B = Q$. This forms a phase space of microstates of the system. The
dynamics takes the system between microstates as shown by the arrows. In this example, the
hopping rules are such that all microstates are visited before a return to the starting point.

Of course, the hypothesis could easily be flawed. A and B might operate rules whereby 
they never give their entire wealth to one another: we would never see them in the 
$(Q,0)$ and $(0,Q)$ microstates. They might have favourite configurations, to which they
return again and again. The nice cashflow rulebook situation that truly gave rise to equal 
frequencies of visits is a rather special case. Nevertheless, the equal likelihood hypothesis 
is a way to start thinking about the system, at least until it can be shown to be misleading 
in some significant way. 

A system with unspecified dynamical rules evolves with some uncertainty, but in a 
similar way the future behaviour of a system with completely specified microscopic 
rules is also hard to pin down, if it is large or complex in some sense. The reason is that 
there will be so many equations of motion to solve and so many initial conditions to 
specify. Even if we had the rules of cashflow between a group of people labelled A-Z, 
we would find it tedious, at the very least, to work out the future wealth distribution. 
Our task would be to judge the probabilities of occurrence of each microstate, given 
some uncertainty in the initial conditions and practical limits in our ability to compute 
outcomes. In physical examples, we would have to follow the position and velocity of 
every particle: we would be at it forever! So rather than give up, perhaps we should 
make things very simple, and imagine as we did for the uncertain cash exchangers A 
and B that all of the microstates are equally likely. What a crazy idea. 

But this is precisely the assumption upon which we base equilibrium statistical 
mechanics. The key word here is equilibrium, meaning that an appreciable amount of 
time has elapsed since the last external disturbance to the system, during which the 
dynamics have a chance to take the system through a reasonable selection of all the 
available microstates. The system is then considered to have settled down and acquired 
statistical properties that are time-independent. 

Think of a gas of weakly interacting particles, into some of which an amount of 
kinetic energy (heat) is injected. Before long, the energy will be shared out amongst 
all the particles as a consequence of collisions, such that there is no mean gradient in
temperature across the system and no gross time dependence in mean properties such
as density: in short, an equilibrium situation has been reached. We don’t quite know the 
rules of the dynamics and we don’t know the initial configuration of all the particles 
when the energy was injected, but intuition suggests that this does not matter. It is in 
these situations that we boldly apply our assumption of equal microstate probabilities. 
This is known as the principle of equal a priori probabilities. If we observe an isolated 
system in equilibrium at an arbitrary time, we claim that it will be found in any one 
of its microstates with equal likelihood. Notice that part of this hypothesis is that the 
system is isolated and left to sort itself out. The phrase a priori refers to the fact that 
we are basing this hypothesis on the information available to us ‘at the outset’, which 
is actually very little. Now we are in a position to make statistical statements about the 
system properties and then to test them to see how the hypothesis fares. 

There are significant problems when it comes to justifying this principle in realistic 
physical systems. On the positive side, it is known that if a system evolves according to 
classical or quantum mechanical Hamiltonian dynamics, and if the probability of every 
microstate is initially the same, then they will remain equal in the future, a result known 
as Liouville’s theorem. Effort has gone into investigating the ergodic hypothesis: the
idea that such dynamics really do take a system into each and every microstate with 
equal frequency after an infinite amount of time. This has produced some supportive 
conclusions, but actually it rather misses the point, as the principle is only a working 
hypothesis: a simple way of arriving at predictions of behaviour, and we should be 
prepared to modify it if need be. 

In addition, even if the dynamics were ergodic, they would generate equally frequent 
visits to each microstate only after an exceedingly long time; at least as long as the time 
needed to see the original configuration restored (this is called the Poincaré recurrence
time and it can be estimated to be greater than the age of the universe for even quite 
small physical systems). Only a very small sample of the microstates will actually be 
visited during a particular observation period. It is seriously crazy to claim that each 
microstate of a realistic dynamical system in equilibrium will actually turn up with 
equal frequency on making measurements over a finite period. But that does not stop us 
from using the principle of equal a priori probabilities as a model of the world: we just 
have to remember that the probabilities assigned to the microstates are not frequencies 
but representations of our best judgement, based on insufficient data, but guided by ideas 
of even-handedness in the absence of full information. 

4.2.2 Microstate Enumeration 

At this point, we put dynamics aside and focus more on the consequences of the assumption
of equal microstate probabilities. Let us extend the model we introduced in the last
section. For two compartments, the number of microstates is Q + 1, a number that clearly
increases with Q. If we add more compartments, for a constant Q, we increase the number
of microstates further. If we have three compartments, for example, the microstates
may be labelled by the respective occupancies $(q_1, q_2, q_3)$ of the three. These can be
visualised as points on a triangular surface inclined with respect to a 3-d set of Cartesian
axes corresponding to the numbers $q_i$, as illustrated in Figure 4.2.

Figure 4.2 A system of three compartments that together hold Q items has microstates that lie
in a triangular pattern on the indicated shaded plane. The axes indicate the number of units $q_i$ in
each compartment. The vertices lie at points $(Q,0,0)$, $(0,Q,0)$ and $(0,0,Q)$.


The sum of the $q_j$ is equal to $Q$, and this locates the vertices of the triangle at points
$(Q,0,0)$, $(0,Q,0)$ and $(0,0,Q)$. If $Q = 1$, these are the only points on the surface. More
generally, the number of points goes like the area of the triangle, and in fact is given
by the triangular number $1 + 2 + \cdots + (Q + 1) = (Q + 1)(Q + 2)/2$. Notice that this
is proportional to $Q^2$ for large $Q$. If we now consider four compartments, the number
of microstates with a given $Q$ corresponds to the number of points on or beneath this
surface. Each plane corresponds to a constant value of $Q_{123} = q_1 + q_2 + q_3$, given that
the fourth compartment possesses $Q - Q_{123}$ items. Considering the 3-d geometry, the
number of microstates labelled by $(q_1, q_2, q_3, q_4)$ is therefore proportional to $Q^3$ for
large $Q$. We begin to see a pattern here and might guess that the number of microstates
is proportional to $Q^{N-1}$ for large $Q$. As $N$ and $Q$ increase, this rapidly becomes an
absolutely huge number.
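
The counting argument can be checked by brute force. The following is a minimal sketch in
Python, not part of the original analysis: the function names are our own, and the
enumeration simply lists every set of occupancies that sums to $Q$. It reproduces $Q + 1$
for two compartments and the triangular number for three, and it suggests that doubling $Q$
multiplies the count by roughly $2^{N-1}$ once $Q$ is large.

```python
def microstates(n_compartments, q_total):
    """Yield every occupancy tuple (q_1, ..., q_N) of non-negative integers summing to Q."""
    if n_compartments == 1:
        yield (q_total,)
        return
    for first in range(q_total + 1):
        for rest in microstates(n_compartments - 1, q_total - first):
            yield (first,) + rest


def count_microstates(n_compartments, q_total):
    """Count microstates by direct enumeration (no closed formula is assumed)."""
    return sum(1 for _ in microstates(n_compartments, q_total))


# Counts for N = 2, 3 and 4 compartments at a few values of Q.
for q_total in (1, 9, 40):
    print(q_total, [count_microstates(n, q_total) for n in (2, 3, 4)])

# Doubling Q should scale the count by about 2**(N - 1) when Q is large.
for n in (2, 3, 4):
    ratio = count_microstates(n, 80) / count_microstates(n, 40)
    print(n, round(ratio, 2), 2 ** (n - 1))
```

The ratios approach $2^{N-1}$ only slowly, which is a reminder that the $Q^{N-1}$ scaling
is a statement about large $Q$.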


4.3 Microstates and Macrostates 

We have enumerated the microscopic states of our system and judged that when it is 
isolated, the dynamics will lead to time-independent and equal probabilities of microstate 
occupation. The next step is to recognise that the microscopic detail is not of interest to 
us in thermodynamics, but that instead we are chiefly concerned with the behaviour of 
macroscopic properties, gross features of the system that are discernible on a macroscopic 
scale. They are collective properties of the components of a system. 

The principle of equal a priori probabilities can tell us how likely it is that a particular
value of a specified macroscopic property is observed. We arrange the microstates into
groups according to the available values $M_\alpha$ of the macroscopic property. Each group
would then be called a macrostate: a collection of microstates with a common specified
property. A macrostate corresponds to our perception of the system on the macroscopic
scale. The number of microstates in a particular group, labelled $\alpha$, is called
the microstate multiplicity $\Omega_\alpha$ of that macrostate. The microstate multiplicity of the
macrostate, divided by the total number of microstates $\Omega = \sum_\alpha \Omega_\alpha$, is then the
probability that the macroscopic property should take value $M_\alpha$ once the isolated system
has come to equilibrium.

Let us illustrate this with our system of three compartments. The labelling of each
microstate corresponds to the three numbers $(q_1, q_2, q_3)$ that specify the number of items
in each compartment. Let us define the ‘spikiness’ of the microstate as the difference
between the highest and the lowest $q_j$:

$$\mathrm{Sp} = \max(\{q_j\}) - \min(\{q_j\}), \tag{4.4}$$

and as the name suggests, Sp tells us whether the pattern of distribution of the items
has highs and lows, or is roughly flat. Thus, the microstate (1,3,5) has a spikiness of
four and the microstate (3,3,3) has a spikiness of zero. Note that these two microstates
correspond to a system with total $Q = 9$.

Figure 4.3 illustrates all 55 microstates of this system, colour coded according to
spikiness. The microstates form the phase space of the system; in this case, it is a
triangular plot corresponding to the diagonal plane intersecting the 3-d axes illustrated
in Figure 4.2. This is the pattern of points that lie on the shaded triangle. The corners
correspond to the microstates labelled (9,0,0), (0,9,0) and (0,0,9), and the point in the
centre is microstate (3,3,3). At any instant in time, the system is in a configuration
represented by a point in this phase space, and as it changes configuration it moves in
the phase space according to dynamical rules that we do not need to specify. The groups
of microstates that all have the same spikiness, and hence have the same colour in the
diagram, are the macrostates $\alpha$ of interest to us.

Figure 4.3 Illustration of the phase space of a system of three compartments possessing nine
items. We can imagine laying this pattern over the shaded triangle in Figure 4.2. Each small
triangle is a microstate, and the colour coding divides the space into nine macrostates of different
spikiness Sp defined in (4.4).

Figure 4.4 Histogram of microstate multiplicity of spikiness macrostates in the N = 3, Q = 9
system.

The contours of different spikiness form roughly hexagonal shells starting at the centre 
of the triangle. If a measurement of spikiness were to be made of the system when in 
equilibrium, the principle of equal a priori probabilities would allow us to predict the 
probability distribution across the range of possible values of spikiness. In this case, 
there are nine values and therefore nine macrostates with respect to this property. The 
histogram of microstate multiplicities of the macrostates is given in Figure 4.4. It 
would seem that a measured spikiness of five is the most likely or modal outcome; and 
more specifically, the mean value of spikiness for this distribution is 5.18. 
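
The grouping of microstates into spikiness macrostates is easy to reproduce numerically.
The sketch below is an illustration only, with function and variable names of our own
choosing: it enumerates the microstates of the N = 3, Q = 9 system, tallies the multiplicity
of each value of Sp, and evaluates the modal and mean spikiness under equal microstate
probabilities.

```python
from collections import Counter


def microstates(n_compartments, q_total):
    """Yield every occupancy tuple of non-negative integers summing to q_total."""
    if n_compartments == 1:
        yield (q_total,)
        return
    for first in range(q_total + 1):
        for rest in microstates(n_compartments - 1, q_total - first):
            yield (first,) + rest


def spikiness(state):
    """Difference between the largest and smallest occupancy, as in (4.4)."""
    return max(state) - min(state)


# Multiplicity of each spikiness macrostate for N = 3, Q = 9.
multiplicity = Counter(spikiness(s) for s in microstates(3, 9))
total = sum(multiplicity.values())

for sp in sorted(multiplicity):
    print(f"Sp = {sp}: multiplicity {multiplicity[sp]}, "
          f"probability {multiplicity[sp] / total:.3f}")

modal_spikiness = max(multiplicity, key=multiplicity.get)
mean_spikiness = sum(sp * m for sp, m in multiplicity.items()) / total
print("modal:", modal_spikiness, "mean:", round(mean_spikiness, 2))
```

Note that a spikiness of one never occurs for Q = 9, since three occupancies that differ by
at most one cannot sum to nine unless they are all equal; this is why there are nine
macrostates rather than ten.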

We observe that the detail of the microstate, in the form of three numbers, has been 
subsumed into a single collective property Sp, and its value characterises the macrostate 
to which the microstate belongs. The number of macrostates so identified (9) is significantly
fewer than the number of microstates (55). If we were to increase the number
of compartments to four at the same value of Q, the range of spikiness, and hence 
number of macrostates, would remain about the same, but the number of microstates 
would become much larger. When we can only measure macroscopic system properties, 
the uncertainty regarding the microscopic detail is typically enormous for a system with 
many components. 

The central assumption of equilibrium statistical thermodynamics is that if we are concerned
with the macroscopic scale, the detail of the dynamics at the microscopic scale is
not too important. The laws of motion need only make it likely that the system is found
amongst its macrostates in proportion to the relevant microstate multiplicity, in other
words, to make the occupation probability of the available microstates roughly equal.
Then statistical information about macroscopic properties can be obtained. The remarkable
thing is that real systems seem to behave in just this manner and the implications
can deepen our understanding of the principles of classical thermodynamics.


4.4 Boltzmann’s Principle and the Second Law 

We now come to Boltzmann’s key insight, and the formula on his gravestone. He
suggested that the number of microstates available to a system is related to its entropy
$S$. He proposed the connection

$$S = k \ln \Omega, \tag{4.5}$$

where $\Omega$ is the number of microstates and $k$ is the Boltzmann constant. $\Omega$ is sometimes
known as the statistical weight but the term microstate multiplicity better conveys its
meaning as the number of microscale manifestations of the system. An important interpretation
that follows is that entropy has something to do with the uncertainty at the
microscopic scale when we only specify the state of a system on the macroscopic scale.
According to the principle of equal a priori probabilities, $\Omega$ determines the probability
that an isolated system might be found in one of the microstates, namely $1/\Omega$.

What is the motivation for this expression? Crucially, Boltzmann recognised that $\Omega$
would increase if a constraint on the dynamics were lifted. Recall that for each available
macrostate of a system there is a corresponding microstate multiplicity. Each macrostate 
would comprise the entire phase space if the system were confined to it through a 
constraint, brought about perhaps by some aspect of the dynamics. Suppose we were to 
set up our example system in a particular macrostate, say the Sp = 9 macrostate of the 
N = 3, Q = 9 case. In other words, the initial microstate of the system would correspond 
to one of the corners of the triangle in Figure 4.3. The system phase space would initially 
comprise just these three microstates. The constraint on the dynamics that confines the 
system to such a phase space would be that only transfers between compartments of nine 
units at a time are allowed. 

But now consider the lifting of this constraint such that the dynamics can transfer 
smaller amounts: the system would then be allowed to assume other microstates within 
the triangle. If the dynamics allowed single unit transfers, for example, the system 
would move to an adjacent microstate in the triangle at each timestep in a deterministic 
but complicated manner. After a period of time, according to the principle of equal a 
priori probabilities, the system would be equally likely to be found in any one of the 
55 microstates. It is clear that the release of the constraint, and the associated change 
in the dynamics, allows the system to explore a phase space with a larger number of 
microstates. The removal of a constraint cannot reduce the number of accessible states; 
it can only increase it. This is so reminiscent of the second law that Boltzmann proposed 
that thermodynamic entropy was related to the microstate multiplicity. 
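
The effect of lifting the constraint can be illustrated with a small calculation. The sketch
below is not the deterministic dynamics referred to in the text, which we have left
unspecified. Instead it assumes a simple stochastic hopping rule of our own: at each
timestep an ordered pair of compartments (donor, receiver) is chosen at random and one unit
is transferred if the donor is not empty. Starting from the corner microstate (9,0,0), the
probability distribution is propagated exactly and is seen to spread evenly over all 55
microstates. The entropy change in units of $k$ is then $\ln 55 - \ln 3$.

```python
from math import log

N_COMPARTMENTS, Q_TOTAL = 3, 9


def microstates(n_compartments, q_total):
    """Yield every occupancy tuple of non-negative integers summing to q_total."""
    if n_compartments == 1:
        yield (q_total,)
        return
    for first in range(q_total + 1):
        for rest in microstates(n_compartments - 1, q_total - first):
            yield (first,) + rest


states = list(microstates(N_COMPARTMENTS, Q_TOTAL))
index = {state: i for i, state in enumerate(states)}

# Assumed hopping rule (hypothetical): six equally likely (donor, receiver) pairs.
moves = [(d, r) for d in range(N_COMPARTMENTS)
         for r in range(N_COMPARTMENTS) if d != r]


def successor(state, donor, receiver):
    """Move one unit from donor to receiver; stay put if the donor is empty."""
    if state[donor] == 0:
        return state
    new_state = list(state)
    new_state[donor] -= 1
    new_state[receiver] += 1
    return tuple(new_state)


# targets[i] lists the six equally likely destinations from microstate i.
targets = [[index[successor(s, d, r)] for d, r in moves] for s in states]


def evolve(probabilities, n_steps):
    """Propagate the microstate probabilities through n_steps timesteps."""
    for _ in range(n_steps):
        updated = [0.0] * len(probabilities)
        for i, p in enumerate(probabilities):
            share = p / len(moves)
            for j in targets[i]:
                updated[j] += share
        probabilities = updated
    return probabilities


initial = [0.0] * len(states)
initial[index[(Q_TOTAL, 0, 0)]] = 1.0      # start in one corner of the triangle
final = evolve(initial, 5000)

largest_deviation = max(abs(p - 1 / len(states)) for p in final)
entropy_change_over_k = log(len(states)) - log(3)   # three corner states before release
print("largest deviation from 1/55:", largest_deviation)
print("entropy change / k:", round(entropy_change_over_k, 3))
```

The hopping rule was chosen because each allowed move and its reverse are equally likely,
which is enough to make the uniform distribution stationary. Other rules with this property
would serve equally well.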

If so, then the requirement that entropy is extensive pins down the functional form of
this relationship. If we have a system with microstate multiplicity $\Omega$, then the multiplicity
of a combination of two identical systems would be $\Omega^2$: for each microstate of the first
replica, there are $\Omega$ microstates of the second. Boltzmann came up with his logarithmic
expression because the entropy of the pair of systems would need to be twice that of a
single system, and $S_{\rm pair} = k \ln \Omega^2 = 2k \ln \Omega = 2S_{\rm single}$ works as required. Boltzmann’s
entropy function is extensive, as it needs to be.

Of course, we have to demonstrate that Boltzmann’s expression reproduces known 
results for the thermodynamic entropy of a system such as the ideal gas expression 
(2.22), otherwise we would have to reject it in spite of these appealing properties. It will 
take us a while to reach that point in this book, but (spoiler alert!) in Section 9.6, we 
shall prove it to be so and indeed identify the unknown constant c in (2.22). Boltzmann’s 
principle will emerge as a very fruitful hypothesis. 

An isolated ideal gas can be used to illustrate the ideas. We describe the gas using the 
dynamically conserved total energy and number of particles as macrostate variables, and 
the volume of the container provides the constraint on the dynamics of the particles. The 
entropy according to Boltzmann is a measure of the multiplicity of microstates associated 
with choices of these conditions. We can elaborate this by grouping the microstates into 
a set of macrostates corresponding to various distinct macroscopic configurations of the 
gas. For example, we might consider macrostates specified by the volume within the 
container that is actually occupied by the gas. Now, consider a gas confined to one half 
of the container by a partition. Initially it is in equilibrium, with a uniform mean density 
throughout its side of the partition. But if the constraining partition were removed, the 
gas would suddenly find itself in a low multiplicity macrostate with respect to the new 
constraint: the volume available to the gas has become bigger and macrostates with 
density profiles with higher multiplicity are newly available. It would begin to evolve. 

The gas will expand, of course, after the partition is removed until it is once again 
homogeneously distributed in the enlarged volume. More precisely, the principle of 
equal a priori probabilities states that once equilibrium is re-established, the gas will 
be found in the available macrostates with probability proportional to the multiplicity 
of the underlying microstates. For the gas released from behind a partition, it turns out 
that the macrostate with uniform density in the larger container will have the highest 
microstate multiplicity of all possible density profiles. The microstate multiplicity of this 
macrostate will make the largest contribution to the total multiplicity. Note carefully 
that the Boltzmann entropy is related to the total number of accessible microstates, not 
just the number associated with the uniform density macrostate. Other arrangements with 
lower multiplicity might be observed as an occasional fluctuation away from the uniform 
density situation. Indeed it is conceivable that this might include the return of the gas 
into its original half of the container, but the likelihood of this happening is negligible 
if the number of particles in the gas is large. 
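
To give a sense of scale, suppose that each particle of the ideal gas is found independently
and with equal probability in either half of the enlarged container, which is an assumption
appropriate to an ideal gas and two equal halves. The probability that all $N$ particles are
found in the original half is then $2^{-N}$. The short sketch below, with a function name
of our own, evaluates the base-10 logarithm of this probability, since the probability
itself underflows for macroscopic $N$.

```python
from math import log10


def log10_return_probability(n_particles):
    """log10 of the chance that all n particles sit in one chosen half: log10(2**-n)."""
    return -n_particles * log10(2)


for n_particles in (10, 100, 1e4, 6.0e23):
    print(f"N = {n_particles:g}: log10(probability) = "
          f"{log10_return_probability(n_particles):.4g}")
```

For ten particles the chance is about one in a thousand, but for a number of particles of
the order of Avogadro's number the exponent itself is of order $10^{23}$.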

If there is a macrostate that dominates the available phase space, then to a good 
approximation the system will simply adopt that macrostate under the new constraint. 
Its state variables will be those that characterise that macrostate. Notice that a system in 
equilibrium is still very dynamic at the microscale and to a lesser extent at the macroscale, 
but that the probabilities of occupation of micro- or macrostates are independent of time, 
which is the deeper meaning of the idea of an equilibrium state. This is Boltzmann’s 
theory. A similar picture of new and old phase spaces in another context is given in 
Figure 4.5. 

An alternative viewpoint is to consider that the probabilities of microstate occupation
change on the release of a constraint. The probability of observing a density profile of
gas that extends across the larger volume was zero before the partition was removed,
and later on it became nonzero. Similarly, the probability of occupation of half the
container was unity to begin with, but evolved towards a very small but nonzero value
after the partition was removed. This suggests that entropy increase is associated with a
change in equilibrium probabilities, equivalent to a change in knowledge or information
represented by the macrostate variables and constraints.

Figure 4.5 The oddly familiar shapes represent macrostates of a physical system. Initially, the
system is constrained to lie within the green macrostate labelled P, but after the removal of
the constraint it explores the broader phase space along the blue trajectory. The most likely
macrostate visited when in equilibrium is macrostate Y, assuming that statistical weight in
that situation is proportional to area. Note that the phase space of a gas of N particles in a
given volume would extend across 6N dimensions, corresponding to their positions and velocities,
not just two as in this representation. Source: Adapted from Daniel Dalet, d-maps.com,
http://d-maps.com/m/angleterre/angleterre56.pdf.

The equilibrium probability distribution of any macroscopic quantity follows by 
identifying the possible macrostates and weighting their contribution to averages 
according to their respective multiplicity, or statistical weight. In order to develop these 
ideas further, we now discuss ensemble theory. 


4.5 Statistical Ensembles 

We have seen that a gas released from behind a partition is able to explore various new 
macrostates, and we proposed that it would be found in each macrostate with a probability 
proportional to the microstate multiplicity of that macrostate, once it has settled into a 
new equilibrium. Developing earlier ideas of Maxwell and Boltzmann, Gibbs suggested 
that the statistics arising from a single system evolving through the macrostates were 
equivalent to those of a collection of systems, each a realisation of a micro- or macrostate. 
The time-averaged properties of a single system when in equilibrium are equivalent to a 
weighted average of the properties of these copies of the system, the weighting being the
appropriate equilibrium probabilities. The collection of such system snapshots is called
an ensemble.

As an example, for our system of N = 3 compartments sharing Q = 9 units, we could
imagine an ensemble of 55 systems, each arranged in one of the microstates, and work
out the ensemble average of spikiness. Each microstate would be equally weighted,
according to the principle of equal a priori probabilities, and the average would be

$$\langle \mathrm{Sp} \rangle = \sum_{i=1}^{55} \frac{1}{55}\, \mathrm{Sp}_i, \tag{4.6}$$

where $\mathrm{Sp}_i$ is the spikiness of the $i$th microstate, and 1/55 is the weighting of that
microstate in the ensemble. Alternatively, we could group microstates into the nine
macrostates and write

$$\langle \mathrm{Sp} \rangle = \sum_{\alpha} \frac{\Omega_\alpha}{\Omega}\, \mathrm{Sp}_\alpha, \tag{4.7}$$

where $\Omega_\alpha$ denotes the multiplicity of the $\alpha$th macrostate and $\mathrm{Sp}_\alpha$ its spikiness.
Clearly, the system is assumed to be in microstate $i$ with probability $1/\Omega$ or in
macrostate $\alpha$ with probability $\Omega_\alpha/\Omega$. Here, $\Omega$ is the total multiplicity of 55 given by
$\Omega = \sum_\alpha \Omega_\alpha$, where the $\Omega_\alpha$ are specified in the histogram in Figure 4.4.
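
The two ways of forming the ensemble average, over microstates and over macrostates, must
agree. A minimal sketch of the check follows, using exact rational arithmetic; the function
and variable names are our own.

```python
from collections import Counter
from fractions import Fraction


def microstates(n_compartments, q_total):
    """Yield every occupancy tuple of non-negative integers summing to q_total."""
    if n_compartments == 1:
        yield (q_total,)
        return
    for first in range(q_total + 1):
        for rest in microstates(n_compartments - 1, q_total - first):
            yield (first,) + rest


def spikiness(state):
    """Difference between the largest and smallest occupancy."""
    return max(state) - min(state)


states = list(microstates(3, 9))
total_multiplicity = len(states)

# Average over microstates: every member of the ensemble carries weight 1/55.
mean_over_microstates = sum(
    Fraction(spikiness(s), total_multiplicity) for s in states
)

# Average over macrostates: each value of Sp is weighted by its multiplicity.
multiplicity = Counter(spikiness(s) for s in states)
mean_over_macrostates = sum(
    Fraction(m, total_multiplicity) * sp for sp, m in multiplicity.items()
)

print(mean_over_microstates, mean_over_macrostates, float(mean_over_microstates))
```

Both routes give 57/11, or approximately 5.18, in agreement with the mean quoted for the
histogram.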

The assumed equivalence of the ensemble average and the time average of a single 
system is actually quite hard to justify. As we have already stated, a time average 
corresponding to a physical measurement of a system at equilibrium corresponds to an 
average over a small sample of the totality of snapshots available in the ensemble. The 
equivalence would be valid as long as these microstates are sampled from the macrostate 
groups in proportion to the size of those groups; attempts to prove this are called ergodic 
theory. 

But if we accept the ensemble approach and the underlying principle of equal a priori 
probabilities, we obtain a very simple strategy for calculating the statistics of system 
properties in equilibrium. We establish the statistical weighting across the micro- or 
macrostates, carry out a weighted average of the system properties, and regard this 
as a prediction of an empirical measurement. We need not concern ourselves with the 
dynamics. We place our faith in the principle of equal a priori probabilities, and proceed 
from there, hoping that comparison with experiment will justify the approach. 

4.6 Statistical Thermodynamics: the Salient Points 

In summary, the core ideas upon which statistical thermodynamics is based are 
as follows: 

• Probability is a means of weighting outcomes in order to work out an expectation of 
future behaviour. 

• The principle of equal a priori probabilities for the occupation of microstates is an 
attempt to capture the effect of the complicated dynamics of isolated systems in 
equilibrium, when the statistical properties are independent of time. 


• Macrostates of a system are characterised by values of macroscopic state variables, 
but the actual microstate taken at any moment could be one of many possibilities, the 
number of which is called the microstate multiplicity of the macrostate. 

• It is proposed that values of macrostate variables are observed with a probability 
proportional to the corresponding microstate multiplicity, for an isolated system. 

• We can evaluate equilibrium averages of micro- or macroscopic variables by performing
ensemble averages over all possible microstates or macrostates.

• Boltzmann’s principle is that the entropy of an isolated system in equilibrium is given
by $k \ln \Omega$, where $\Omega$ is the total microstate multiplicity consistent with the macroscopic
state variables and dynamical constraints.

• Following the relaxation of a constraint, the number of available microstates increases, 
and this is the underlying rationale for the second law. 


Exercises 

4.1 Two people A and B together possess £9 and exchange cash between them in units 
of £1. Determine the change in Boltzmann entropy that arises when a third person 
C joins in, according to the assumptions of complex system behaviour we have 
discussed. 

4.2 A, B and C share £3. List the 10 microstates in terms of labels $(q_1, q_2, q_3)$, where $q_j$
is the number of pound coins possessed by the $j$th participant. Plot a histogram of the
microstate multiplicity of each spikiness macrostate. Calculate the mean and standard
deviation of the spikiness, assuming the principle of equal a priori microstate
probabilities.

4.3 A, B and C share £9 but exchange it in batches of £3. Determine the Boltzmann 
entropy of the system and the probability of observing the most likely, or modal, 
spikiness. They change the rules to allow exchanges of £1. Calculate the new entropy 
and the probability of observing the new modal spikiness. 

4.4 A has £3 but then starts sharing it with B, C and D. Calculate the probability under 
the new situation that A should hold the £3 again. 


5 Statistical Thermodynamics of a System of Harmonic Oscillators


We now apply the principles developed in the last chapter to a reasonably realistic 
physical system: a set of N weakly coupled quantum harmonic oscillators that collectively 
hold Q quanta of energy. We will construct ensembles and obtain statistical information 
about the system when it is in equilibrium. 

A quantum harmonic oscillator with natural frequency $\omega$ has energy levels $E_q =
(q + 1/2)\hbar\omega$ where $q$ is a non-negative integer. We shall ignore the zero point energy
$\hbar\omega/2$, and denote the number of quanta held by the $j$th oscillator by the integer $q_j$.
We then have total energy $E = \hbar\omega \sum_j q_j$, such that $Q = \sum_j q_j$ is proportional to $E$. The
oscillators are weakly coupled, in the sense that quanta can be exchanged between them,
but without complicating the specification of the system energy. The system is very
similar to the example of compartments sharing units of material introduced in the last
chapter and we illustrate four oscillators, represented as springs, in Figure 5.1.

5.1 Microstate Enumeration 

The number of microstates with total number of quanta $Q$ may be evaluated for this
system using a neat trick. Unfortunately, it is not so easy for other systems! We have
to distribute $Q$ quanta amongst $N$ oscillators. Each microstate of the system might be
pictured as a set of $N$ groups of objects, the $j$th group consisting of $q_j$ objects, with the
total number of objects equal to $Q$. If we line up the $Q$ objects next to each other and
insert $N - 1$ dividers to separate them into $N$ groups, the situation might resemble the
arrangement shown in Figure 5.1. The key point is that each possible microstate of the
system (a specific set of labels $\{q_j\}$) corresponds to an arrangement of these $Q + N - 1$
elements. A different choice of positions of the dividers would correspond to a different
set of $q_j$. For example, in Figure 5.1, there are seven objects and three dividers, making
ten elements in all. The pattern corresponds to $q_1 = 3$, $q_2 = 4$ and $q_3 = q_4 = 0$, roughly
represented by the amplitudes of oscillation shown for the springs.

Figure 5.1 Illustration of four quantum oscillators holding 3, 4, 0 and 0 quanta, respectively.
The microstate is represented by the sequence of seven objects (circles) and three dividers (lines)
shown. The possible microstates of seven quanta held by four oscillators correspond to different
linear arrangements of these ten objects.


So how many arrangements might there be? We can solve this problem by working out
the number of ways we can place the $N - 1$ dividers on a set of $Q + N - 1$ positions.
The objects then take the positions not occupied by a divider, and can be ignored. For
example, one such way is to have the $N - 1$ dividers in the first $N - 1$ positions, and
the $Q$ objects in the rest. This corresponds to the microstate $(0, 0, \ldots, Q)$ in the notation
$(q_1, q_2, \ldots, q_N)$.

There are $Q + N - 1$ positions available for the first divider, $Q + N - 2$ positions
for the second and $Q + N - (N - 1)$ positions for the last one, giving a multiplicity
of arrangements of $(Q + N - 1)(Q + N - 2) \cdots (Q + N - (N - 1))$. But this is the
number of ways a set of different dividers can be arranged and therefore overcounts
the true multiplicity since the dividers are in fact indistinguishable. We can correct the
result by dividing by the number of possible arrangements of the $N - 1$ dividers on the
$N - 1$ occupied positions, which is $(N - 1)(N - 2) \cdots 1$, and so the correct microstate
multiplicity of the system is

$$\Omega(N,Q) = \frac{(Q+N-1)(Q+N-2)\cdots(Q+N-(N-1))}{(N-1)(N-2)\cdots 1} = \frac{(Q+N-1)!}{Q!\,(N-1)!}. \tag{5.1}$$

We can check this with N = 3, Q = 9 to obtain the multiplicity of 55 considered in the
previous chapter.
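
The divider construction translates directly into a short program. The sketch below is an
illustration with names of our own choosing: it places the $N - 1$ dividers on the
$Q + N - 1$ positions in every possible way, converts each placement into a set of
occupancies, and compares the number of microstates found with the factorial formula, here
evaluated as a binomial coefficient.

```python
from itertools import combinations
from math import comb


def microstate_from_dividers(divider_positions, n_positions):
    """Convert sorted divider positions (counted from 0) into occupancies (q_1, ..., q_N)."""
    occupancies = []
    previous = -1
    for position in divider_positions:
        # Objects lying strictly between the previous divider and this one.
        occupancies.append(position - previous - 1)
        previous = position
    # Objects to the right of the last divider.
    occupancies.append(n_positions - previous - 1)
    return tuple(occupancies)


def enumerate_microstates(n_oscillators, q_quanta):
    """List the microstates generated by every placement of the N - 1 dividers."""
    n_positions = q_quanta + n_oscillators - 1
    return [
        microstate_from_dividers(dividers, n_positions)
        for dividers in combinations(range(n_positions), n_oscillators - 1)
    ]


# The arrangement of Figure 5.1: ooo|oooo|| has dividers at positions 3, 8 and 9.
print(microstate_from_dividers((3, 8, 9), 10))

for n_oscillators, q_quanta in ((3, 9), (4, 7)):
    found = enumerate_microstates(n_oscillators, q_quanta)
    formula = comb(q_quanta + n_oscillators - 1, n_oscillators - 1)
    print(n_oscillators, q_quanta, len(found), formula)
```

Each placement of dividers yields a distinct set of occupancies, so the number of
placements is the multiplicity.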

Factorials crop up naturally in calculations of multiplicity and they are often inconvenient
in analysis. However, using Stirling’s approximation, factorials of large numbers
are readily converted into powers. Stirling’s formula states that for large $m$:

$$\ln m! \approx m \ln m - m. \tag{5.2}$$

This may be derived by noting that $\ln m! = \ln m + \ln(m - 1) + \cdots + \ln 1 = \sum_{n=1}^{m} \ln n
\approx \int_1^m \ln x \, dx = [x \ln x - x]_1^m \approx m \ln m - m$ for large $m$. So, if both $Q$ and $N$ are
large, then

$$\begin{aligned}
\ln \Omega(N,Q) &= \ln (Q+N-1)! - \ln Q! - \ln (N-1)! \\
&\approx (Q+N-1)\ln(Q+N-1) - (Q+N-1) - Q \ln Q + Q - (N-1)\ln(N-1) + N - 1 \\
&= (Q+N-1)\ln(Q+N-1) - Q \ln Q - (N-1)\ln(N-1).
\end{aligned} \tag{5.3}$$

This implies that

$$\ln \Omega(N,Q) \approx Q \ln\left(1 + \frac{N-1}{Q}\right) + (N-1)\ln\left(1 + \frac{Q}{N-1}\right), \tag{5.4}$$

and by a further approximation, we get

$$\Omega(N,Q) \approx \left(1 + \frac{N}{Q}\right)^{Q} \left(1 + \frac{Q}{N}\right)^{N-1}. \tag{5.5}$$

This indicates that $\Omega$ can be a very large number, for any ratio of $Q$ to $N$. Notice that
the multiplicity of the system for $Q \gg N$ is

$$\Omega(N,Q) \approx e^{N} \left(\frac{Q}{N}\right)^{N-1} \propto Q^{N-1}, \tag{5.6}$$

and this is consistent with the result we guessed in Section 4.2.2.
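
It is instructive to see how good the Stirling form of the multiplicity is in practice. The
sketch below compares the exact value of $\ln \Omega(N,Q)$, obtained from the log-gamma
function, with the approximation $Q \ln(1 + (N-1)/Q) + (N-1)\ln(1 + Q/(N-1))$. The function
names and the chosen values of $N$ and $Q$ are ours, for illustration only.

```python
from math import lgamma, log


def ln_multiplicity_exact(n, q):
    """ln[(Q+N-1)! / (Q! (N-1)!)], using lgamma(x + 1) = ln(x!)."""
    return lgamma(q + n) - lgamma(q + 1) - lgamma(n)


def ln_multiplicity_stirling(n, q):
    """Stirling form: Q ln(1 + (N-1)/Q) + (N-1) ln(1 + Q/(N-1))."""
    return q * log(1 + (n - 1) / q) + (n - 1) * log(1 + q / (n - 1))


def relative_error(n, q):
    """Fractional discrepancy of the Stirling form from the exact logarithm."""
    exact = ln_multiplicity_exact(n, q)
    return abs(ln_multiplicity_stirling(n, q) - exact) / exact


for n, q in ((3, 9), (100, 500), (1000, 5000), (10**6, 5 * 10**6)):
    print(f"N = {n}, Q = {q}: exact ln(multiplicity) = "
          f"{ln_multiplicity_exact(n, q):.6g}, "
          f"relative error = {relative_error(n, q):.2e}")
```

The approximation is poor for the N = 3, Q = 9 system, as we should expect for such small
numbers, but the relative error in the logarithm shrinks steadily as $N$ and $Q$ grow.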


5.2 Microcanonical Ensemble 

The set of microstates available to a system is known as its phase space. This may often
be visualised as a coordinate space of high dimension, such as the $N$-dimensional space
labelled by the harmonic oscillator quantum numbers $\{q_j\}$. A snapshot of the system
is represented by a single point in this space, and the evolution of the system in time
corresponds to the motion of that point along a trajectory through the phase space.

The dynamics of an isolated system are constrained by the conservation of energy, 
and the phase space is then characterised by a specific value of energy. According to the 
principle of equal a priori probabilities, the statistics of an isolated system are to be found 
by giving equal weight to every possible microstate in this phase space. Equivalently, 
we can imagine an ensemble of systems each of which takes one of the microstates, 
and then obtain averages of properties across this ensemble, with each member of the 
ensemble given equal weight in the averaging. The collection of configurations under 
consideration here is called the microcanonical ensemble and is designed for studying 
an isolated system.

For example, a microcanonical ensemble of the N = 3, Q = 9 oscillator system
consists of 55 copies of the system, one in each of the possible microstates. In order to
calculate the mean spikiness, we would calculate Sp for each microstate and average
over the ensemble with equal weighting for each copy. This is exactly what was done
in Section 4.5. The calculation might be represented as

$$\langle A \rangle = \sum_i P_i A_i, \tag{5.7}$$

where $i$ labels the microstates, $P_i$ is the equilibrium probability that the system is found
in microstate $i$ and $A_i$ is the value of a system property of interest when in microstate
$i$. For the microcanonical ensemble, the microstate probabilities are all equal to $1/\Omega$.

5.3 Canonical Ensemble 

The canonical ensemble is a method for calculating the statistical properties of a system 
that is not isolated. It is able to exchange energy with its environment. Consequently, 
it is able to explore a phase space that includes microstates with different energies, 
in contrast to the microcanonical case just considered. However, the weightings of the 
system microstates when performing ensemble averages are now not equal. The word 
canonical is used in the sense of standard. The word microcanonical used to describe 
an ensemble of an isolated system is derived from this and although the terminology is 
not particularly revealing, it has become established. 

Let us see how the canonical ensemble arises by considering a system of $N$ oscillators
weakly coupled to a larger system of $N_r = N_{\rm tot} - N$ oscillators that we shall call the
reservoir or heat bath. The total energy of the combined system of $N_{\rm tot}$ oscillators
corresponds to $Q_{\rm tot}$ quanta. The reservoir has a microstate multiplicity $\Omega_r(N_r, Q_r)$ when
it has $Q_r$ quanta. The label r reminds us that it refers to the reservoir. The combination
of system and reservoir is allowed to explore the $(N_{\rm tot} - 1)$-dimensional phase space of
the combined system, characterised by a constant energy, and will reach equilibrium
corresponding to equal probabilities of occupation of any individual combined microstate.
The system plus reservoir can therefore be studied using a microcanonical ensemble.

We are interested in the average properties of the system, not the combination of system 
and reservoir. We want to construct an ensemble of just microstates of the system. It is 
therefore sensible to divide the phase space of the combined system and reservoir into 
macrostates labelled by the microstate of the system. We then deduce that each of these 
macrostates has a multiplicity equal to that of the reservoir, given that the system is in 
a particular microstate. 

For example, consider again our $N_{\rm tot} = 3$, $Q_{\rm tot} = 9$ system, and regard this as a
combination of a reservoir (two oscillators) and a system (one oscillator, with label $j = 3$).
The combined microcanonical ensemble consists of the 55 microstates of the three
oscillators. An ensemble of the single oscillator, in contrast, would comprise all possible
configurations of the single oscillator, allowing it to take different energies, specifically
the ten microstates corresponding to the values of $q_3$ from 0 to 9.

Now, the macrostate of combined system and reservoir with the system microstate
specified by $q_3 = 0$ has a microstate multiplicity of ten. This is because the nine quanta
can be shared amongst the two reservoir oscillators in ten ways (corresponding to
$\Omega_r(N_r = 2, Q_r = 9) = 10!/(1!\,9!) = 10$). For $q_3 = 5$, the four remaining quanta are to
be shared between the two reservoir oscillators and the multiplicity of this macrostate
of the combined system and reservoir is now $\Omega_r(N_r = 2, Q_r = 4) = 5!/(1!\,4!) = 5$. The
relative weightings of the $q_3 = 0$ and $q_3 = 5$ system microstates of the single oscillator
in this ensemble are therefore 10 and 5. The weightings for other system microstates are
found to be $(10 - q_3)$ by similar reasoning. The argument is summarised in Figure 5.2.

Figure 5.2 Illustration of the construction of the canonical ensemble. The phase space of three
oscillators with nine quanta is carved up into a ziggurat of macrostates labelled by the microstate of
oscillator 3, namely by $q_3$. The statistics of oscillator 3 (the system) are to be found by weighting
its microstates by the size of the phase space of oscillators 1 and 2 (the reservoir) when $q_3$ takes
a specific value. This weighting is $10 - q_3$.

We may then write mean system properties as:

$$\langle A \rangle = \sum_{q_3} A(q_3) P(q_3), \tag{5.8}$$

which is a weighted sum over system microstates labelled by $q_3$. The probability $P(q_3)$
is proportional to the multiplicity of the reservoir when it possesses $Q_{\rm tot} - q_3$ quanta,
namely $\Omega_r(N_r, Q_{\rm tot} - q_3)$. The proportionality constant is chosen to ensure the
normalisation of the probabilities, so in the case considered, $P(q_3) = (10 - q_3)/55$. The average
number of quanta in the system according to this ensemble is then clearly

$$\langle q_3 \rangle = \sum_{q_3=0}^{9} q_3 P(q_3) = \sum_{q_3=0}^{9} \frac{q_3 (10 - q_3)}{55}
= \frac{1}{55}(0 + 9 + 16 + 21 + 24 + 25 + 24 + 21 + 16 + 9) = 3. \tag{5.9}$$

This answer should be no surprise, since the combination of system and reservoir has
nine quanta distributed between three oscillators with equal microstate probabilities: the
average number of quanta per oscillator has to be three. Using the same probabilities,
we could work out the mean square of $q_3$ and hence the variance.

The important conclusion is that microstate probabilities for an open system are not 
equal. The principle of equal a priori probabilities applies only for isolated systems. But 
by splitting an isolated system into two parts (a system and a reservoir), we have been 
able to deduce the statistical weighting of the microstates of the system. 

We have just seen an explicit example for a small reservoir, but a very important
universal form of the probabilities of system microstates emerges when we consider
a single oscillator in contact with a very large reservoir, with $N_r = N_{\rm tot} - 1 \gg 1$. As
before, the system microstate probabilities are

$$P(q) \propto \Omega_r(N_r, Q_{\rm tot} - q), \tag{5.10}$$

where $q$ is the number of quanta held by the single oscillator. If we take $q \ll Q_{\rm tot}$ for
almost all microstates, and regard $Q_r$ as a continuous variable, we can make the following
Taylor expansion:

$$\ln \Omega_r(N_r, Q_{\rm tot} - q) = \ln \Omega_r(N_r, Q_{\rm tot}) - q \left. \frac{\partial \ln \Omega_r(N_r, Q_r)}{\partial Q_r} \right|_{Q_r = Q_{\rm tot}} + \cdots \tag{5.11}$$


and ignore terms in $q^2$ and beyond. We expand the logarithm of $\Omega_r$ and not $\Omega_r$ itself for
two reasons. First, we wish to obtain an approximate representation where the reservoir
multiplicity and hence the system microstate probability $P(q)$ is always positive, and this
is guaranteed if we expand the logarithm. Secondly, $\Omega_r(N_r, Q_{\rm tot} - q)$ is a very rapidly
varying function of $q$ and so an expansion of the more slowly changing logarithm is
likely to be accurate over a wider range of $q$.

From (5.3), we can determine the derivative in (5.11) to be

$$\beta = \left. \frac{\partial \ln \Omega_r(N_r, Q_r)}{\partial Q_r} \right|_{Q_r = Q_{\rm tot}} = \frac{\partial \ln \Omega_r(N_r, Q_{\rm tot})}{\partial Q_{\rm tot}} = \ln(Q_{\rm tot} + N_r - 1) + 1 - \ln Q_{\rm tot} - 1 \approx \ln\left(1 + \frac{N_{\rm tot}}{Q_{\rm tot}}\right), \tag{5.12}$$

where we have introduced a symbol $\beta$ to denote the derivative. Since $\beta$ depends on
$N_{\rm tot}/Q_{\rm tot}$, it is clearly a property of the combined system and reservoir, and effectively
of the reservoir alone, to a good approximation, since the system is a tiny component
of the combination. From (5.10)-(5.12) we are then able to write the system microstate
probabilities as

$$P(q) \propto \exp(-\beta q). \tag{5.13}$$


When we considered the mini-reservoir with $N_r = 2$, we deduced a linear relationship
between the probability of each single oscillator microstate and the number of quanta
held, as employed in (5.9). For very large $N_r$, this goes over to an exponential decrease.
This is the so-called canonical probability distribution and it is illustrated in Figure 5.3.
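
The passage from a linear to an exponential weighting can be seen numerically by comparing exact reservoir multiplicities with the factor $\exp(-\beta)$. The sketch below assumes the multiplicity $\Omega(N, Q) = (Q + N - 1)!/(Q!\,(N - 1)!)$; the helper names and the chosen sizes $N_{\rm tot} = 10^4$, $Q_{\rm tot} = 3 \times 10^4$ are illustrative assumptions.

```python
from math import comb, exp, log


def multiplicity(n_osc, quanta):
    """Number of ways to share `quanta` quanta among `n_osc` oscillators."""
    return comb(quanta + n_osc - 1, quanta)


def exact_ratio(n_tot, q_tot, q):
    """Exact P(q + 1) / P(q) for one oscillator coupled to n_tot - 1 others."""
    n_res = n_tot - 1
    return multiplicity(n_res, q_tot - q - 1) / multiplicity(n_res, q_tot - q)


# Mini-reservoir of two oscillators: successive ratios vary with q (linear weighting)
small = [exact_ratio(3, 9, q) for q in range(3)]

# Large reservoir: successive ratios are nearly constant and close to exp(-beta)
N_TOT, Q_TOT = 10_000, 30_000
beta = log(1 + N_TOT / Q_TOT)          # large-reservoir estimate, as in (5.12)
large = [exact_ratio(N_TOT, Q_TOT, q) for q in range(5)]

print("small reservoir ratios:", small)
print("large reservoir ratios:", large)
print("exp(-beta):", exp(-beta))
```

A constant ratio between successive probabilities is exactly what characterises an exponential distribution, so the near-constancy of the large-reservoir ratios is the numerical signature of (5.13).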

Canonical ensemble averages of system properties can now be obtained by performing 
summations such as 

$$\langle A \rangle = \frac{1}{Z} \sum_q A(q) \exp(-\beta q), \tag{5.14}$$

where

$$Z = \sum_q \exp(-\beta q) \tag{5.15}$$

is a normalising factor; $A(q)$ is the value of a system property when the system is in
the microstate $q$, and the sum is taken over all microstates of the system labelled by $q$.

Figure 5.3 The canonical probability distribution $P(q) \propto \exp(-\beta q)$ of quanta $q$ held by a single
oscillator in thermal contact with a large reservoir characterised by $\beta = 1$. Note the logarithmic
scale.

For example, consider a single oscillator that is part of a large system of $N_{\rm tot}$ oscillators
characterised by parameters $N_{\rm tot} \gg 1$, $Q_{\rm tot} \gg 1$ and $Q_{\rm tot}/N_{\rm tot} = 3$. Then from (5.12), we
find that $\beta = \ln(4/3)$ and the average number of quanta held by the single oscillator is

$$\langle q \rangle = \frac{1}{Z} \sum_{q=0}^{Q_{\rm tot}} q \exp(-\beta q). \tag{5.16}$$

The system has $Q_{\rm tot} + 1$ microstates, and if $Q_{\rm tot} \gg 1$, the upper limit in the sum can
effectively be replaced with $\infty$. We define $x = \exp(-\beta)$ such that $Z = \sum_{q=0}^{\infty} x^q = (1 - x)^{-1}$.
Similarly

$$\sum_{q=0}^{\infty} q \exp(-\beta q) = -\frac{d}{d\beta} \sum_{q=0}^{\infty} \exp(-\beta q) = -\frac{dZ}{d\beta} = -\frac{dZ}{dx}\frac{dx}{d\beta} = -(1 - x)^{-2}(-x), \tag{5.17}$$

and so

$$\langle q \rangle = Z^{-1} \sum_{q=0}^{\infty} q \exp(-\beta q) = \frac{(1 - x)x}{(1 - x)^2} = \frac{\exp(-\beta)}{1 - \exp(-\beta)}. \tag{5.18}$$

For $\beta = \ln(4/3)$, $\langle q \rangle = 3$. Once again this is as we expect, since the system plus reservoir
is simply $N_{\rm tot}$ oscillators carrying $Q_{\rm tot} = 3N_{\rm tot}$ quanta. We have recovered the mean
number of quanta per oscillator, but this time by considering the canonical distribution
of a single oscillator over its microstates.
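
The closed form (5.18) can be compared with a brute-force evaluation of the canonical sums. This is a minimal sketch; the function names and the truncation `q_max` (standing in for the infinite upper limit) are assumptions made for illustration.

```python
from math import exp, log


def mean_quanta_sum(beta, q_max=2000):
    """<q> from explicit sums over microstates q = 0 .. q_max."""
    weights = [exp(-beta * q) for q in range(q_max + 1)]
    z = sum(weights)                                   # normalising factor Z
    return sum(q * w for q, w in enumerate(weights)) / z


def mean_quanta_closed(beta):
    """<q> from the closed form exp(-beta) / (1 - exp(-beta))."""
    return exp(-beta) / (1.0 - exp(-beta))


beta = log(4 / 3)
print(mean_quanta_sum(beta), mean_quanta_closed(beta))
```

Both routes should agree to within rounding error, because the neglected terms beyond `q_max` are suppressed by the factor $x^{q}$ with $x = 3/4$.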

5.4 The Thermodynamic Limit

The canonical ensemble describes the statistical properties of a system that is able to 
exchange energy with a very large reservoir. The thermodynamic limit refers to behaviour 
as the system becomes larger and larger. Some important conclusions emerge. 

For a large system, it can make sense to consider its probability distribution over 
macrostates instead of microstates. For example, the system energy, proportional to the 
number of quanta in our example of oscillators, might be used as a system macrostate 
label. In the single oscillator system just considered, there was only one microstate 
corresponding to each system energy, but in more complex systems there will usually 
be more than one. 

Let us consider a system consisting of $N$ oscillators in contact with a reservoir
characterised by parameter $\beta$. Each system macrostate is labelled by the number of quanta
$Q$ it holds, and has a microstate multiplicity of $\Omega(N, Q)$. The system moves between
macrostates as time progresses as a consequence of exchanges of quanta with the
reservoir that alter $Q$. The canonical ensemble average over macrostates may then be written
in the form

$$\langle A \rangle = \frac{1}{Z} \sum_{Q=0}^{\infty} A(Q)\, \Omega(N, Q) \exp(-\beta Q), \tag{5.19}$$

where

$$Z = \sum_{Q=0}^{\infty} \Omega(N, Q) \exp(-\beta Q). \tag{5.20}$$

Note that this is simply a reorganisation of the sum over microstates. Each microstate with
$Q$ quanta has canonical weighting $\exp(-\beta Q)$ and we have grouped all such microstates
($\Omega(N, Q)$ of them) to form one of the macrostate terms in the sum in (5.19).
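
The statement that the macrostate sum is a regrouping of the microstate sum can be tested numerically. The sketch below rests on an assumption that is not spelled out at this point in the text: that the oscillators are independent, so the sum over microstates factorises into $N$ identical geometric series, each equal to $(1 - e^{-\beta})^{-1}$. The function names and the truncation `q_max` are illustrative.

```python
from math import comb, exp


def z_macro(n_osc, beta, q_max=400):
    """Sum over macrostates Q of Omega(N, Q) exp(-beta Q), truncated at q_max."""
    return sum(comb(q + n_osc - 1, q) * exp(-beta * q) for q in range(q_max + 1))


def z_micro(n_osc, beta):
    """Sum over microstates, written as a product of N single-oscillator sums."""
    return (1.0 - exp(-beta)) ** (-n_osc)


for n_osc in (1, 2, 5):
    print(n_osc, z_macro(n_osc, 1.0), z_micro(n_osc, 1.0))
```

Agreement between the two columns illustrates that grouping microstates by $Q$ changes the bookkeeping but not the value of $Z$.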

We saw in (5.5) that $\Omega$ is a power-like, rapidly increasing function of $Q$ for an oscillator
system. On the other hand, the factor $\exp(-\beta Q)$, with $\beta > 0$, is a rapidly decreasing
function of $Q$. As a consequence, the macrostate weighting factor $\Omega(N, Q) \exp(-\beta Q)$
increases with $Q$ until it reaches a peak and then falls, as exponential suppression at
large $Q$ is stronger than power-law amplification.

It is very instructive to expand the weighting factor around the peak in the distribution
at $Q = Q^*$. This is the largest or modal probability over the macrostates. We write the
probability distribution as

$$P(Q) = Z^{-1} \exp\left(\ln \Omega(N, Q) - \beta Q\right), \tag{5.21}$$

and determine $Q^*$ from the condition $\partial P/\partial Q = 0$, equivalent to $\partial \ln P/\partial Q = 0$, or

$$\left. \frac{\partial}{\partial Q}\left(\ln \Omega(N, Q) - \beta Q\right) \right|_{Q = Q^*} = 0, \tag{5.22}$$

from which we deduce that

$$\left. \frac{\partial \ln \Omega(N, Q)}{\partial Q} \right|_{Q = Q^*} = \beta. \tag{5.23}$$

Let us recall that $\beta$ is effectively a property of the reservoir and that it is defined
in (5.12) as a derivative of the logarithm of reservoir multiplicity with respect to the
number of quanta it holds. The system property that appears on the left hand side in
(5.23) has a similar form, and we might call it the system $\beta$-parameter, or $\beta_s$, for the
system macrostate at the modal number of quanta $Q^*$. Each macrostate of the system
has a property $\beta_s(Q) = \partial \ln \Omega/\partial Q$ that is a function of $Q$. We conclude from (5.23)
that when it is in equilibrium, the system is most likely to be found in the macrostate
that has the same $\beta$ parameter as the reservoir. This is very reminiscent of the equality
of temperature between a system and reservoir in thermal equilibrium, which we will
consider later.

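For this particular case, and assuming the system parameters $N$ and $Q$ are large, (5.12)
and (5.23) may be used to identify $Q^*$ in terms of $\beta$:

$$\frac{\partial}{\partial Q} \ln \Omega(N, Q) \approx \frac{\partial}{\partial Q}\left((Q + N - 1)\ln(Q + N - 1) - Q \ln Q\right) = \ln\left(\frac{Q + N - 1}{Q}\right), \tag{5.24}$$

so from (5.23)

$$\beta_s(Q^*) = \ln\left(\frac{Q^* + N - 1}{Q^*}\right) = \beta \quad \Rightarrow \quad Q^* = \frac{N - 1}{\exp(\beta) - 1}. \tag{5.25}$$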


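Next we look at the extent of fluctuations away from this most likely or modal macrostate.
We expand the first term in the exponent in (5.21) about $Q^*$:

$$\ln \Omega(N, Q) \approx \ln \Omega(N, Q^*) + (Q - Q^*) \left. \frac{\partial \ln \Omega(N, Q)}{\partial Q} \right|_{Q = Q^*} + \frac{(Q - Q^*)^2}{2} \left. \frac{\partial^2 \ln \Omega(N, Q)}{\partial Q^2} \right|_{Q = Q^*}, \tag{5.26}$$

and recognising through (5.23) that the second term on the right hand side is $(Q - Q^*)\beta$
we get

$$\ln \Omega(N, Q) - \beta Q \approx \ln \Omega(N, Q^*) - \beta Q^* + \frac{(Q - Q^*)^2}{2} \left. \frac{\partial^2 \ln \Omega(N, Q)}{\partial Q^2} \right|_{Q = Q^*}, \tag{5.27}$$

and so we can approximate the system macrostate probability as

$$P(Q) \propto \Omega(N, Q) \exp(-\beta Q) \approx \Omega(N, Q^*) \exp(-\beta Q^*) \exp\left(-\frac{(Q - Q^*)^2}{2\sigma^2}\right), \tag{5.28}$$

where we define

$$\frac{1}{\sigma^2} = -\frac{\partial^2 \ln \Omega(N, Q^*)}{\partial Q^{*2}}. \tag{5.29}$$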


According to this approximation, the canonical pdf over macrostates is Gaussian and
the modal macrostate labelled by $Q^*$ is also the mean macrostate characterised by $\langle Q \rangle$,
since $P(Q)$ is symmetric about $Q = Q^*$. The normalisation sum $Z$ may be written

$$Z \approx \int_0^{\infty} dQ\, \Omega(N, Q) \exp(-\beta Q) \approx \Omega(N, \langle Q \rangle) \exp(-\beta \langle Q \rangle) \int_{-\infty}^{\infty} dQ\, \exp\left(-\frac{(Q - \langle Q \rangle)^2}{2\sigma^2}\right) = \Omega(N, \langle Q \rangle) \exp(-\beta \langle Q \rangle)\,(2\pi)^{1/2} \sigma. \tag{5.30}$$

$Z$ is therefore well approximated by the largest term in the sum in (5.20), multiplied by
a factor of $(2\pi)^{1/2} \sigma$.

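Let us examine further the standard deviation $\sigma$ of the distribution. We find from
differentiating (5.24) that

$$\sigma^2 = -\left[\frac{\partial^2 \ln \Omega(N, Q^*)}{\partial Q^{*2}}\right]^{-1} = \frac{Q^*(Q^* + N - 1)}{N - 1}, \tag{5.31}$$

and for $N \gg 1$, the ratio of the standard deviation to the mean is proportional to $N^{-1/2}$:

$$\frac{\sigma}{Q^*} \approx \left(\frac{Q^* + N}{N Q^*}\right)^{1/2} \approx \left(\frac{\exp(\beta)}{N}\right)^{1/2}, \tag{5.32}$$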


having used (5.25). This means that the distribution becomes very sharply peaked as
$N \to \infty$. This is illustrated in Figure 5.4, where for clarity $Q$ is scaled by its mean
to ensure that the peak does not move as $N$ changes. In this so-called thermodynamic
limit where $N$ is large, the fluctuations in macrostate are unlikely to be observable,
and the system can be regarded as having a constant number of quanta $Q^*$ related
to the $\beta$ parameter of the reservoir through (5.25). Thus, on the macroscopic scale, a
system described by the canonical ensemble seems to be static, although underlying
this apparent quiescence is a continual exploration of microstates. It is simply that the
exploration rarely strays away from the $Q = Q^*$ macrostate.
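
The narrowing of the distribution with system size can be examined without the Gaussian approximation, by summing the exact macrostate weights $\Omega(N, Q)\exp(-\beta Q)$. This is a sketch under stated assumptions: logarithms of the multiplicity are evaluated with `math.lgamma` to avoid huge integers, and the truncation point `q_max` is an ad hoc choice made wide enough for the values of $N$ and $\beta$ used here. The function name is hypothetical.

```python
from math import exp, lgamma, sqrt


def macrostate_stats(n_osc, beta):
    """Mean and standard deviation of Q for weights Omega(N, Q) exp(-beta Q)."""
    q_star = (n_osc - 1) / (exp(beta) - 1.0)             # modal value, as in (5.25)
    q_max = int(10 * q_star + 50 * sqrt(q_star) + 200)   # generous truncation
    # ln Omega(N, Q) = ln (Q + N - 1)! - ln Q! - ln (N - 1)!
    log_w = [lgamma(q + n_osc) - lgamma(q + 1) - lgamma(n_osc) - beta * q
             for q in range(q_max + 1)]
    shift = max(log_w)                                   # avoid overflow in exp
    w = [exp(lw - shift) for lw in log_w]
    z = sum(w)
    mean = sum(q * wi for q, wi in enumerate(w)) / z
    var = sum((q - mean) ** 2 * wi for q, wi in enumerate(w)) / z
    return mean, sqrt(var)


beta = 1.0
for n_osc in (10, 100, 1000):
    mean, std = macrostate_stats(n_osc, beta)
    print(n_osc, std / mean, sqrt(exp(beta) / n_osc))
```

The second and third columns printed are the measured relative width and the estimate of (5.32); both fall as $N^{-1/2}$.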



Figure 5.4 Probability distribution function of system parameter $Q$ for two system sizes $N$,
with $Q$ scaled by the mean of the distribution, indicating how the relative width of the distribution
shrinks as the system size increases. In the thermodynamic limit $N \to \infty$, the distribution becomes
sharply peaked and fluctuations are extremely unlikely.

5.5 Temperature and the Zeroth Law of Thermodynamics


We have seen that the microstate probabilities of a system of harmonic oscillators in
the canonical ensemble depend on a property $\beta$ that characterises the large reservoir of
oscillators with which it is in thermal contact. We now define an analogous property
of systems or reservoirs that are more general than collections of harmonic oscillators.
We write

$$\beta = \frac{\partial \ln \Omega(N, E)}{\partial E}, \tag{5.33}$$

where $\Omega$ is the microstate multiplicity of a macrostate of a system labelled by energy
$E$ and number of particles $N$. For oscillators, the number of quanta was proportional to
energy and so the $\beta$ parameter of (5.12) matched this form. The $\beta$ parameter defined in (5.33)
has dimensions of inverse energy, while the oscillator parameter of (5.12) was dimensionless.
But following a similar development,
we can show that the most likely energy macrostate of a system has the same value of $\beta$
as the reservoir with which it is in equilibrium. In the thermodynamic limit, this modal
macrostate is essentially the only macrostate ever explored by the system. Therefore, a
large system and reservoir, when in equilibrium under heat exchange, are characterised
by the same parameter $\beta$. Other systems in equilibrium with the reservoir will all be
characterised by this parameter. This suggests that $\beta$ is an indicator of thermal
equilibrium between macroscopic objects, and must therefore correspond to some function of
temperature.

Let us now insert Boltzmann's formula $S = k \ln \Omega$, implying that

$$\beta = \frac{1}{k} \frac{\partial S(N, E)}{\partial E} = \frac{1}{kT}, \tag{5.34}$$

as long as $S(N, E)$ is indeed equivalent to the thermodynamic entropy, as Boltzmann
claimed, and where we have employed (2.45). Now we see that $\beta k$ is the inverse
temperature, and so it is no surprise that it plays the role of indicator of thermal equilibrium.
On the basis of statistical thermodynamics, we seem to have found a justification of the
zeroth law of thermodynamics.

Thus, when a system is in equilibrium with a reservoir or heat bath, the fixed temperature
$T$ of the heat bath imposes two related conditions. Firstly, all microstate probabilities
of the system take a canonical form proportional to $\exp(-E/kT)$, where $E$ is the
microstate energy. This is called the Boltzmann factor. Secondly, the most likely energy
macrostate of a large system has the same temperature as the heat bath, where macrostate
temperature is defined in terms of a derivative, with respect to energy, of the microstate
multiplicity of the macrostate. Furthermore, energy and temperature fluctuations of a system
usually become unobservable as the system gets very large: this is the thermodynamic
limit, so called because such behaviour ties in with the concept of time-independent
macrostate variables in classical thermodynamics.


5.6 Generalisation 

The example of a system and reservoir of harmonic oscillators studied in this chapter has
given us an illustration of the following general approach, regarded as the central doctrine
of statistical physics. The canonical ensemble is a collection of all possible microstates of
a system, of which there might be an infinite number, each weighted by a canonical
probability $P(E_i) = Z^{-1} \exp(-E_i/kT)$, where $Z = \sum_i \exp(-E_i/kT)$ and $E_i$ is the energy of
the microstate. Such an ensemble provides the statistical properties of the system when
in thermal equilibrium with an environment at temperature $T$. We can also construct
a canonical ensemble over macrostates, such that $P(E_\alpha) = Z^{-1} \Omega(E_\alpha) \exp(-E_\alpha/kT)$,
where $Z = \sum_\alpha \Omega(E_\alpha) \exp(-E_\alpha/kT)$ and $E_\alpha$ is the energy of the macrostate. Such an
approach is consistent with various features of the thermodynamics of macroscopic
systems, but also extends to systems that are much smaller. This entire picture follows from
the principle of equal a priori probabilities, and is a remarkably simple proposition for
describing the equilibrium behaviour of a complex system. We investigate some of its
implications in the next few chapters.
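
The general prescription for a canonical ensemble over microstates can be sketched in a few lines. The function names, the three illustrative energy levels and the choice of units in which $kT = 1$ are all assumptions made for this example.

```python
from math import exp


def canonical_probabilities(energies, kT):
    """Canonical probabilities P(E_i) = exp(-E_i / kT) / Z for listed microstates."""
    e_min = min(energies)   # common shift cancels in the ratio and avoids overflow
    weights = [exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def canonical_average(values, energies, kT):
    """Ensemble average of a property with value values[i] in microstate i."""
    probs = canonical_probabilities(energies, kT)
    return sum(a * p for a, p in zip(values, probs))


levels = [0.0, 1.0, 2.0]                  # hypothetical microstate energies
probs = canonical_probabilities(levels, kT=1.0)
mean_energy = canonical_average(levels, levels, kT=1.0)
print(probs, mean_energy)
```

Subtracting the lowest energy before exponentiating leaves the probabilities unchanged, since a constant shift in energy multiplies every Boltzmann factor and $Z$ by the same amount.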


Exercises 

5.1 A system of $N$ weakly coupled quantum oscillators contains $Q = 2N$ quanta. Show
that the total microstate multiplicity of the system macrostate in which oscillator 1
possesses $q_1$ quanta is given by $\omega(N, q_1) = (3N - 2 - q_1)!/[(2N - q_1)!\,(N - 2)!]$.

5.2 For cases $N = 2$, 3 and 4: (a) specify the allowed range of $q_1$, (b) evaluate the
microstate multiplicity $\omega(N, q_1)$ for each value of $q_1$ in this range and (c) calculate
the mean and standard deviation of $q_1$, assuming that the principle of equal a priori
probabilities holds.

5.3 This question examines the approach towards the canonical ensemble description
of the properties of a single oscillator. (a) Use Stirling's approximation to simplify
$\ln \omega(N, q_1)$ for $N \gg q_1$. (b) Expand $\ln \omega(N, q_1)$ to first order in $q_1$ and take the limit
$N \to \infty$ to show that $\omega(\infty, q_1) \propto \exp(-\beta q_1)$. Demonstrate that the $\beta$ parameter
is equal to $\ln(3/2)$ for this system. (c) Sketch the $q_1$ dependence of the weighting
factors $\omega(2, q_1)$, $\omega(3, q_1)$, $\omega(4, q_1)$ and $\omega(\infty, q_1)$.

5.4 A system of $N = 4$ oscillators holding $Q = 8$ quanta is initially maintained in
equilibrium in a macrostate with $q_1 = 4$ quanta held by oscillator 1. Calculate the
Boltzmann entropy of the system in this initial state. The constraint on the value of
$q_1$ is removed, and the system assumes a new equilibrium state. What is the change
in Boltzmann entropy of the system?

5.5 Two coupled quantum harmonic oscillators share nine quanta of energy according
to unspecified dynamical rules. The system is then allowed to exchange quanta with
a third oscillator. Initially, the third oscillator possesses no quanta. Determine the
change in Boltzmann entropy once equilibrium has been restored.

5.6 A system of three oscillators sharing four quanta is coupled to a system of seven
oscillators sharing eight quanta such that they can exchange energy. Calculate the
Boltzmann entropy of each separate system, and of the coupled system once it has
reached equilibrium. Determine the change in entropy brought about by the coupling.

5.7 Consider a system of $N$ oscillators sharing $Q$ quanta, with $N \gg 1$, $Q \gg 1$ and
$Q = \alpha N$, where $\alpha$ is a constant. Use Stirling's formula to obtain an approximate
expression for the Boltzmann entropy of the system and show that it is an extensive
quantity.

5.8 Calculate the mean and standard deviation of the number of quanta $q$ held by a single
oscillator in a canonical ensemble when the reservoir parameter is $\beta = \ln(3/2)$.

5.9 Calculate the mean and standard deviation of the number of quanta $Q$ held by three
oscillators in a canonical ensemble when the reservoir parameter is $\beta = \ln(3/2)$.


6 

The Boltzmann Factor and the 
Canonical Partition Function 


The canonical ensemble discussed in the previous chapter is designed to represent the 
statistics of a system that is open to energy exchange with a large environment. It is 
mathematically easier to employ in practice than the microcanonical ensemble designed 
for isolated systems. Let us now see some examples of its use, where we will also 
encounter the important concept of the canonical partition function. 


6.1 Simple Applications of the Boltzmann Factor 


The Boltzmann factor is $\exp(-E/kT)$, where $E$ is the energy of the micro- or macrostate
in question. It plays a role in the statistical weighting of such a state in a canonical 
ensemble. It can be applied in a variety of ways. 


6.1.1 Maxwell-Boltzmann Distribution 


We consider a gas of atoms of mass $m$, and regard one atom as the system and the
remainder of the gas as the heat bath or reservoir. The system exchanges energy with
the reservoir through occasional atomic collisions. To a very good approximation, the
system energy is just the kinetic energy of the atom, if interatomic interactions are weak.
Let us now consider macrostates of the system labelled by the magnitude $v$ of the velocity
of the atom. We want to determine the canonical probability density function $p(v)$ and
so we write

$$p(v)\,dv \propto d\Omega(v) \exp\left(-\frac{E}{kT}\right) \propto \rho(v) \exp\left(-\frac{mv^2}{2kT}\right) dv. \tag{6.1}$$

For $E$ we have used the kinetic energy $mv^2/2$. The factor $d\Omega(v)$ is the number of
microstates of the atom, each specified by a 3-d velocity $\mathbf{v}$, that have a speed in the
range $v$ to $v + dv$. It is written as an increment to reflect the fact that it is a
multiplicity of microstates within an incremental range of a microscopic variable, and has
been expressed as $\rho(v)\,dv$ using a density of microstates $\rho(v)$ per unit range of speed.
Assuming that velocity microstates are to be found with equal density across the 3-d
velocity space, an assumption that will be justified in Chapter 9 using quantum
mechanics, the multiplicity $d\Omega$ is proportional to the volume $4\pi v^2 dv$ of a shell of radius $v$ and
thickness $dv$ in 3-d velocity space, as illustrated in Figure 6.1. The density of microstates
is therefore proportional to $4\pi v^2$. This produces the celebrated Maxwell-Boltzmann
speed distribution function:

$$p(v) \propto v^2 \exp\left(-\frac{mv^2}{2kT}\right), \tag{6.2}$$

and this may be employed to calculate a number of system properties.

Figure 6.1 The number of microstates with speed in a range $dv$ about a value $v$ is proportional
to the volume of a shell of radius $v$ and thickness $dv$, assuming a uniform density of microstates
in 3-d velocity space.

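The mean energy of the atom, for example, is

$$\langle E \rangle = \frac{1}{Z} \int_0^{\infty} \frac{mv^2}{2}\, v^2 \exp\left(-\frac{mv^2}{2kT}\right) dv, \tag{6.3}$$

where $Z = \int_0^{\infty} v^2 \exp(-mv^2/2kT)\,dv$, such that the pdf is normalised, that is,
$\int_0^{\infty} p(v)\,dv = 1$. After an integration by parts, we find that

$$\langle E \rangle = \frac{1}{Z} \frac{3kT}{2} \int_0^{\infty} v^2 \exp\left(-\frac{mv^2}{2kT}\right) dv = \frac{3}{2} kT, \tag{6.4}$$

having noticed that the remaining integral is proportional to $Z$. Thus the temperature of
the reservoir controls the mean kinetic energy of the atom.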
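
The result $\langle E \rangle = (3/2)kT$ can be checked by numerical quadrature of the Maxwell-Boltzmann weight $v^2 \exp(-mv^2/2kT)$. The sketch below uses a simple midpoint rule; the function name, the cutoff at twelve thermal speeds and the number of steps are illustrative assumptions, and the units are arbitrary.

```python
from math import exp


def mb_mean_energy(m, kT, n_steps=20_000):
    """Mean kinetic energy under the weight v^2 exp(-m v^2 / 2kT)."""
    v_max = 12.0 * (kT / m) ** 0.5     # the integrand is negligible beyond this
    dv = v_max / n_steps
    z = 0.0
    e_sum = 0.0
    for i in range(n_steps):
        v = (i + 0.5) * dv             # midpoint of the i-th interval
        w = v * v * exp(-m * v * v / (2.0 * kT))
        z += w                         # accumulates Z / dv
        e_sum += 0.5 * m * v * v * w   # accumulates the energy integral / dv
    return e_sum / z                   # the common factor dv cancels


for m, kT in ((1.0, 1.0), (2.0, 0.5), (0.1, 3.0)):
    print(m, kT, mb_mean_energy(m, kT), 1.5 * kT)
```

The mass drops out of the answer, which reflects the statement that the reservoir temperature alone controls the mean kinetic energy.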

Each atom in the gas must have the same statistical properties, so the mean energy
of a gas of $N$ atoms at temperature $T$ is $(3/2)NkT$. The constant volume heat capacity
of such a gas is the temperature derivative of the mean energy when the volume is held
constant and we find that

$$C_V = \frac{d\langle E \rangle}{dT} = \frac{3}{2} Nk. \tag{6.5}$$

These results may be compared with (2.6) and (2.10) and suggest that averages of
quantities in statistical thermodynamics correspond to the values of corresponding state
variables in classical thermodynamics.


6.1.2 Single Classical Oscillator and the Equipartition Theorem 

Similarly, we make the assumption that the microstates of a classical 1-d harmonic
oscillator, labelled by position $x$ and velocity $v$, are distributed with uniform density
$\rho(x, v) = \rho_0$ over the phase space spanned by these two coordinates. The pdf over these
variables, according to the canonical ensemble, would then be

$$p(x, v) \propto \rho(x, v) \exp\left(-\frac{E(x, v)}{kT}\right) \propto \exp\left(-\frac{\kappa x^2 + mv^2}{2kT}\right), \tag{6.6}$$


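where $\kappa$ is the spring constant of the oscillator, and this distribution provides us with
statistical averages of system properties. The canonical average of system quantity $A(x, v)$ is

$$\langle A \rangle = \frac{1}{Z} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} A(x, v)\, \rho_0 \exp\left(-\frac{\kappa x^2 + mv^2}{2kT}\right) dx\, dv, \tag{6.7}$$

where $Z = \int\!\!\int \rho_0 \exp(-(\kappa x^2 + mv^2)/2kT)\,dx\,dv$. Using the standard Gaussian integral
$\int_{-\infty}^{\infty} \exp(-ax^2)\,dx = (\pi/a)^{1/2}$, the normalising factor is

$$Z = \rho_0 \left(\frac{2\pi kT}{\kappa}\right)^{1/2} \left(\frac{2\pi kT}{m}\right)^{1/2}. \tag{6.8}$$

For example, the average energy, using integration by parts, is

$$\begin{aligned}
\langle E \rangle &= \frac{1}{Z} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{1}{2}(\kappa x^2 + mv^2)\, \rho_0 \exp\left(-\frac{\kappa x^2 + mv^2}{2kT}\right) dx\, dv \\
&= \frac{1}{2}\left(\frac{\kappa}{2\pi kT}\right)^{1/2} \int_{-\infty}^{\infty} \kappa x^2 \exp\left(-\frac{\kappa x^2}{2kT}\right) dx + \frac{1}{2}\left(\frac{m}{2\pi kT}\right)^{1/2} \int_{-\infty}^{\infty} mv^2 \exp\left(-\frac{mv^2}{2kT}\right) dv \\
&= \frac{1}{2} kT \left(\frac{\kappa}{2\pi kT}\right)^{1/2} \int_{-\infty}^{\infty} \exp\left(-\frac{\kappa x^2}{2kT}\right) dx + \frac{1}{2} kT \left(\frac{m}{2\pi kT}\right)^{1/2} \int_{-\infty}^{\infty} \exp\left(-\frac{mv^2}{2kT}\right) dv \\
&= \frac{1}{2} kT + \frac{1}{2} kT = kT.
\end{aligned} \tag{6.9}$$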


It is a general result that for each term in the energy that is quadratic in a microstate
variable, such as $x$ or $v$, the mean energy in the canonical ensemble is $kT/2$. We saw
this in the case of the ideal gas in the last section, when there were three such terms
(proportional to $v_x^2$, $v_y^2$ and $v_z^2$). This result is called the equipartition theorem, and it
is very powerful since quadratic terms in the energy are frequently seen. For a 3-d
oscillator, the energy is $(1/2)\kappa(x^2 + y^2 + z^2) + (1/2)m(v_x^2 + v_y^2 + v_z^2)$, using Cartesian
position and velocity components, and the mean energy in a canonical ensemble would
be $3kT$, a contribution of $(1/2)kT$ for each quadratic term, or equivalently for each
so-called degree of freedom.
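
Equipartition can be illustrated numerically: a quadratic energy term $c s^2/2$, averaged with weight $\exp(-c s^2/2kT)$, should give $kT/2$ whatever the coefficient $c$. The sketch below uses a midpoint rule; the function name, the cutoff and the particular values of the spring constant and mass are arbitrary illustrative assumptions. Because the oscillator energy is a sum of two such terms, the two averages can be computed separately and added.

```python
from math import exp


def mean_quadratic_energy(c, kT, n_steps=20_000):
    """Average of c s^2 / 2 with weight exp(-c s^2 / 2kT), s over the real line."""
    s_max = 12.0 * (kT / c) ** 0.5        # weight is negligible outside +/- s_max
    ds = 2.0 * s_max / n_steps
    z = 0.0
    e_sum = 0.0
    for i in range(n_steps):
        s = -s_max + (i + 0.5) * ds       # midpoint of the i-th interval
        w = exp(-c * s * s / (2.0 * kT))
        z += w
        e_sum += 0.5 * c * s * s * w
    return e_sum / z


kT = 1.3
kappa, mass = 4.0, 0.25                   # arbitrary spring constant and mass
e_potential = mean_quadratic_energy(kappa, kT)
e_kinetic = mean_quadratic_energy(mass, kT)
e_total = e_potential + e_kinetic
print(e_potential, e_kinetic, e_total)
```

Each term should come out close to $kT/2 = 0.65$ in these units, and the total close to $kT$.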


6.1.3 Isothermal Atmosphere Model 

Now we consider an ideal gas in a gravitational field. Once again, we take the system
to be a single molecule, and the reservoir to be the rest of the gas. Imagine the gas is
isothermal, that is, the temperature is uniform with height $z$. Let us consider macrostates
of the single molecule labelled by height and we seek a pdf $p(z)$ such that the
probability of finding the single molecule in the height range $z \to z + dz$ is $p(z)\,dz$ with a
normalisation $\int_0^{\infty} p(z)\,dz = 1$. The potential energy of the macrostate is $mgz$, where $m$
is the mass of the molecule and $g$ the acceleration due to gravity.

The microstate multiplicity of such a macrostate is the number of microstates
corresponding to different velocities and various states of rotation and internal vibration
available to the molecule at height $z$, but this does not depend on height. Thus,
the $z$-dependence of the pdf for the macrostate variable $z$ is simply proportional to
$\exp(-mgz/kT)$. By extension, the density of the gas is proportional to this factor, as it
will mirror the probability distribution function for the position of a single molecule.
The gas pressure $p$ (not to be confused here with the pdf $p(z)$) is proportional to
density, for an isothermal ideal gas, and hence

$$p \propto \exp\left(-\frac{mgz}{kT}\right). \tag{6.10}$$

Applying this to the atmosphere, the pressure should fall exponentially with height,
reducing by a factor of $e$ with each ascent through a distance $z_s = kT/mg$. For the Earth,
this height is roughly 8 km (taking $T \approx 290$ K and the mean molecular mass of air) and is
the right order of magnitude, but the real terrestrial
atmosphere is not isothermal. The temperature varies with height, and so the pressure
profile differs from the isothermal profile in practice.
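
The scale height $z_s = kT/mg$ is easy to evaluate. The following sketch assumes illustrative values that are not taken from the text: a uniform temperature of 288 K, a mean molar mass of air of 0.029 kg/mol and $g = 9.81$ m s$^{-2}$. The function names are hypothetical.

```python
from math import exp

K_B = 1.380649e-23      # Boltzmann constant in J/K
N_A = 6.02214076e23     # Avogadro constant in 1/mol


def scale_height(temperature, molar_mass, g=9.81):
    """Height kT/mg over which an isothermal ideal gas pressure falls by e."""
    m = molar_mass / N_A                  # mass of one molecule in kg
    return K_B * temperature / (m * g)


def pressure_ratio(z, temperature, molar_mass, g=9.81):
    """p(z) / p(0) for the isothermal atmosphere model."""
    return exp(-z / scale_height(temperature, molar_mass, g))


z_s = scale_height(288.0, 0.029)          # assumed air temperature and molar mass
print("scale height in km:", z_s / 1000.0)
print("pressure ratio at 5 km:", pressure_ratio(5000.0, 288.0, 0.029))
```

Lighter gases have larger scale heights under the same conditions, since $z_s$ is inversely proportional to the molecular mass.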

6.1.4 Escape Problems and Reaction Rates 

As well as providing a basis for detailed calculations of statistical averages, as we have
seen in the above examples, the Boltzmann factor can be used to estimate the likelihood
of rare events. Consider a particle in a potential well, but weakly interacting with a
reservoir of other particles at temperature $T$. Imagine that the particle can escape from
the well if it can acquire a threshold energy $E_t$ equal to its binding energy. An example
might be an electron bound to a nucleus but interacting with other particles too. What
is the thermal ionisation rate?

The probability that the particle might acquire sufficient energy to reach the escape
threshold is estimated to be proportional to $\exp(-E_t/kT)$. This is only an approximation,
of course, because a system such as this is not in equilibrium. The statistical properties
are time dependent as the particle is able to escape: as time goes on the probability that
it remains in the well might go to zero. Nevertheless, the Boltzmann factor provides a
rule of thumb, and we conclude that a particle with a binding energy of many $kT$ will
find it difficult to escape by thermal excitation.

An escape problem on a larger scale concerns the rate of loss of a planetary atmosphere.
The potential energy of a molecule in the gravitational field of a body of mass $M$ is
$-GMm/r$, where $G$ is the gravitational constant, $m$ is the molecular mass and $r$ is
the radial distance from the centre of the body. If the atmosphere is at a temperature $T$,
molecules at a radius $r$ will escape at a significant rate if $kT$ is greater than some specified
fraction of the molecular gravitational energy. Hot or low mass planets therefore lose
their atmospheres more quickly than massive or cool planets.

By a similar argument, liquids should evaporate at a rate that resembles a Boltzmann
factor. The exponential form of the Clausius-Clapeyron equation for equilibrium vapour
pressure in (3.73) confirms this expectation. A molecule will escape from the condensed
phase if it acquires an amount of energy equal to the latent heat of evaporation per
particle $L_e$. It will condense from the gas at a rate proportional to the pressure of the
vapour. In equilibrium, there is a balance between the rates of escape and condensation
and therefore the saturated vapour pressure will be proportional to $\exp(-L_e/kT)$.

As a final example, consider that the temperature dependence of chemical reactions is
often found to be proportional to an Arrhenius factor $\exp(-E_c/kT)$ with some
characteristic energy $E_c$. The interpretation, in a similar spirit, is that the reaction requires an
energy barrier to be surmounted, and that a Boltzmann factor expresses the likelihood
that the reactants acquire this energy thermally from their surroundings.
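
These rule-of-thumb estimates are straightforward to put into numbers. The sketch below is illustrative only: the 0.5 eV barrier and the two temperatures are assumed values chosen for the example, the 13.6 eV figure is the binding energy of the hydrogen atom, and the function names are hypothetical. It evaluates Boltzmann factors only, not actual rates, which would also need a prefactor.

```python
from math import exp

K_B_EV = 8.617333262e-5     # Boltzmann constant in eV/K


def boltzmann_factor(energy_ev, temperature):
    """exp(-E / kT) for an energy in eV and a temperature in kelvin."""
    return exp(-energy_ev / (K_B_EV * temperature))


def arrhenius_speedup(barrier_ev, t_low, t_high):
    """Ratio of Boltzmann factors at two temperatures for the same barrier."""
    return boltzmann_factor(barrier_ev, t_high) / boltzmann_factor(barrier_ev, t_low)


# A binding energy of many kT: hydrogen ionisation at room temperature
print("ionisation factor at 300 K:", boltzmann_factor(13.6, 300.0))

# A modest barrier: sensitivity of the factor to a 10 K rise in temperature
print("speed-up from 300 K to 310 K:", arrhenius_speedup(0.5, 300.0, 310.0))
```

The first number is vanishingly small, in line with the remark that a particle bound by many $kT$ finds it difficult to escape, while the second shows how strongly an Arrhenius factor responds to a small change in temperature.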


6.2 Mathematical Properties of the Canonical Partition 
Function 

The normalising factor $Z$ that appears in the canonical probability distribution $P(E) =
Z^{-1} \exp(-E/kT)$ is far more central to the application of the canonical ensemble than it
might appear. It is so prominent that it has been given a name. It is called the canonical
partition function, terminology that expresses its role in describing the partitioning of
probability amongst the available possibilities.

Consider a system with microstates labelled $i$, each with energy $E_i$. The canonical
partition function is

$$Z = \sum_i \exp(-\beta E_i), \tag{6.11}$$

where the sum is over all microstates, and where it is convenient to use the reservoir
parameter $\beta$ instead of $1/kT$. The derivative of $Z$ with respect to $\beta$ is

$$\frac{dZ}{d\beta} = -\sum_i E_i \exp(-\beta E_i), \tag{6.12}$$

and hence the mean energy in the canonical ensemble is

$$\langle E \rangle = \sum_i E_i P(E_i) = \frac{1}{Z} \sum_i E_i \exp(-\beta E_i) = -\frac{1}{Z} \frac{dZ}{d\beta} = -\frac{d \ln Z}{d\beta} = kT^2 \frac{d \ln Z}{dT}, \tag{6.13}$$

using $d\beta/dT = -1/kT^2$. This demonstrates the mathematical convenience of the
canonical ensemble and the central role of $Z$: statistical properties can often be obtained just
by taking suitable derivatives of the partition function.
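
The identity $\langle E \rangle = -d \ln Z/d\beta$ can be checked on any finite set of energy levels by comparing the explicit weighted sum with a numerical derivative. In this sketch the energy levels, the value of $\beta$, the step size `h` and the function names are all arbitrary illustrative assumptions.

```python
from math import exp, log


def log_z(beta, energies):
    """Logarithm of the canonical partition function for listed microstates."""
    return log(sum(exp(-beta * e) for e in energies))


def mean_energy_direct(beta, energies):
    """<E> as an explicit probability-weighted sum over microstates."""
    z = sum(exp(-beta * e) for e in energies)
    return sum(e * exp(-beta * e) for e in energies) / z


def mean_energy_from_z(beta, energies, h=1e-5):
    """<E> = -d ln Z / d beta, estimated by a central finite difference."""
    return -(log_z(beta + h, energies) - log_z(beta - h, energies)) / (2.0 * h)


levels = [0.0, 0.3, 1.0, 2.5]     # hypothetical microstate energies
beta = 0.8
print(mean_energy_direct(beta, levels), mean_energy_from_z(beta, levels))
```

The finite difference stands in for the analytic derivative, so agreement is expected only to within the truncation error of the difference scheme.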

Moreover, the partition function plays a role in making the connection between
classical and statistical thermodynamics. Let us consider the temperature derivative of the
Helmholtz free energy divided by temperature. Using methods familiar from Chapter 3,
we have

$$d\left(\frac{F}{T}\right) = \frac{1}{T} dF - \frac{F}{T^2} dT = \frac{1}{T}(-S\,dT - p\,dV + \mu\,dN) - \frac{E - TS}{T^2} dT = -\frac{E}{T^2} dT - \frac{p}{T} dV + \frac{\mu}{T} dN, \tag{6.14}$$

having employed (3.20), so that

$$E = -T^2 \left(\frac{\partial (F/T)}{\partial T}\right)_{V,N}. \tag{6.15}$$

If we compare this with (6.13), and once again regard the thermodynamic state variable $E$
as equivalent to the average system energy in the canonical ensemble, we conclude that

$$F = -kT \ln Z = \langle E \rangle - TS, \tag{6.16}$$

which is an extremely important result that we shall meet again in Section 8.2.
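Another powerful result is

$$\sigma^2 = \langle (E - \langle E \rangle)^2 \rangle = \langle E^2 \rangle - \langle E \rangle^2 = \frac{1}{Z} \sum_i E_i^2 \exp(-\beta E_i) - \langle E \rangle^2 = \frac{1}{Z} \frac{d^2 Z}{d\beta^2} - \frac{1}{Z^2} \left(\frac{dZ}{d\beta}\right)^2 = \frac{d^2 \ln Z}{d\beta^2} = -\frac{d\langle E \rangle}{d\beta} = kT^2 \frac{d\langle E \rangle}{dT}, \tag{6.17}$$

that establishes a connection between the variance of the energy and $d\langle E \rangle/dT$, the heat
capacity at constant volume.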
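
The link between energy fluctuations and heat capacity can be checked numerically for a small set of levels. The sketch below works in units where $k = 1$, so that $kT^2\, d\langle E \rangle/dT$ becomes $T^2\, d\langle E \rangle/dT$; the energy levels, temperature, step size and function names are illustrative assumptions.

```python
from math import exp


def moments(beta, energies):
    """Mean and variance of the energy in the canonical ensemble."""
    w = [exp(-beta * e) for e in energies]
    z = sum(w)
    mean = sum(e * wi for e, wi in zip(energies, w)) / z
    mean_sq = sum(e * e * wi for e, wi in zip(energies, w)) / z
    return mean, mean_sq - mean ** 2


def heat_capacity(temp, energies, h=1e-5):
    """d<E>/dT by central finite difference, in units where k = 1."""
    e_hi, _ = moments(1.0 / (temp + h), energies)
    e_lo, _ = moments(1.0 / (temp - h), energies)
    return (e_hi - e_lo) / (2.0 * h)


levels = [0.0, 0.3, 1.0, 2.5]     # hypothetical microstate energies
temp = 1.25
_, variance = moments(1.0 / temp, levels)
print(variance, temp ** 2 * heat_capacity(temp, levels))
```

The two printed numbers should agree closely, which is the content of the variance relation for this toy system.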

We can show in general that the relative magnitude of thermal fluctuations is inversely
proportional to the square root of the number of particles. We saw a specific illustration
of this for the oscillator system in Section 5.4. From the expression for the variance in
(6.17), we evaluate the ratio of the standard deviation in energy to the mean:

$$\frac{\sigma}{\langle E \rangle} = \frac{\left(\langle E^2 \rangle - \langle E \rangle^2\right)^{1/2}}{\langle E \rangle} = \frac{\left(kT^2\, d\langle E \rangle/dT\right)^{1/2}}{\langle E \rangle}. \tag{6.18}$$

Both the heat capacity $d\langle E \rangle/dT$ and the mean energy $\langle E \rangle$ are proportional to the number
of particles in the system and hence are extensive: if the system size is doubled, they
both double. Inserting such dependence, it is clear that the ratio in (6.18) is proportional
to $N^{-1/2}$. The mean and the standard deviation both increase as the thermodynamic limit
is approached, but in a relative sense, the distribution over energy becomes sharper. The
thermodynamic limit is characterised by the complete neglect of fluctuations.

Note, however, that there are circumstances where statistical fluctuations can become
apparent even in macroscopic systems. An example is the phenomenon of critical
opalescence of a fluid near its so-called critical conditions of pressure and temperature.
These conditions correspond to the right hand end of the gas-liquid coexistence
line in Figure 3.8, where the distinction between a gas and a liquid is lost. Near this
point, the fluid experiences strong local fluctuations in fluid density, since the surface
tension characterising the interface between such patches becomes very small, giving
rise to variations in optical properties and a consequent cloudiness in the fluid. Equation
(6.17) would suggest that the heat capacity near the critical point becomes anomalously
large as well, and this is borne out experimentally.

In the next few sections, we study several examples of the role of the partition function 
in establishing the statistical properties of simple systems. 

6.3 Two-Level Paramagnet 

One of the simplest systems that can be studied in a canonical ensemble is a two-level 
paramagnet. A paramagnet is a material that can acquire a magnetisation when exposed to 
an external magnetic field, but loses it when the field is turned off. The basic element of 
the material is an atomic magnetic dipole that can orient itself with respect to the external 
field. Quantum mechanics tells us that the number of possible dipole orientations is finite, 
not continuous, and in the very simplest case, there might be just two microstates labelled
by $m_s$: the dipole can be aligned with ($m_s = +1$) or against ($m_s = -1$) the field. The
microstates are characterised by magnetic dipole moments $m_s\mu_B$ and energies $-m_s\mu_B B$,
where $B$ is the magnitude of the external magnetic field and $\mu_B = e\hbar/2m_e$ is the Bohr
magneton, with $e$ the elementary charge and $m_e$ the mass of the electron.

We are interested in the mean magnetisation $M$ of the paramagnet. If there are $N$
dipoles in the material, then $M = N\mu_B\langle m_s\rangle$, where the brackets denote a canonical
average under the influence of a heat bath at temperature $T$. The partition function is

$$Z = \sum_{m_s=\pm 1}\exp\left(\frac{m_s\mu_B B}{kT}\right) = \exp\left(\frac{\mu_B B}{kT}\right) + \exp\left(-\frac{\mu_B B}{kT}\right) = 2\cosh\left(\frac{\mu_B B}{kT}\right).$$

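The average may be constructed as

$$\langle m_s\rangle = \sum_{m_s=\pm 1} m_s P(m_s) = \frac{\exp(\mu_B B/kT) - \exp(-\mu_B B/kT)}{\exp(\mu_B B/kT) + \exp(-\mu_B B/kT)},$$

but the same result can be obtained by differentiating the partition function:

$$\langle m_s\rangle = \frac{1}{Z}\sum_{m_s=\pm 1} m_s\exp\left(\frac{m_s\mu_B B}{kT}\right) = \frac{1}{Z}\frac{dZ}{d(\mu_B B/kT)} = \frac{2\sinh(\mu_B B/kT)}{2\cosh(\mu_B B/kT)} = \tanh\left(\frac{\mu_B B}{kT}\right). \tag{6.21}$$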


Thus, $\langle m_s\rangle = 0$ if $B = 0$ (the absence of an external field), but for a strong field such
that $|B| \gg kT/\mu_B$, the dipole is fully aligned in the direction of the field, corresponding
to $|\langle m_s\rangle| \to 1$. For $|B| \ll kT/\mu_B$, $\tanh(\mu_B B/kT) \approx \mu_B B/kT$ and the paramagnetisation
of the material is then approximately

$$M \approx \frac{N\mu_B^2 B}{kT}. \tag{6.22}$$


This linear proportionality between magnetisation and external field, and inverse
proportionality to temperature, is a well-established experimental result known as Curie’s law.
The mean magnetisation of the two-level paramagnet over a range of fields is illustrated 
in Figure 6.2. 
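The two routes to $\langle m_s\rangle$ can be compared numerically. The sketch below is illustrative only: the function names are our own, and the field enters through the dimensionless combination $b = \mu_B B/kT$. It evaluates the Boltzmann-weighted average directly, then estimates $d\ln Z/db$ by a central finite difference, and prints both alongside $\tanh b$.

```python
import math


def partition_function(b):
    """Z for a single dipole, where b is the dimensionless field mu_B * B / (k T)."""
    return sum(math.exp(m_s * b) for m_s in (+1, -1))


def mean_ms_direct(b):
    """<m_s> as a Boltzmann-weighted average over the two microstates."""
    z = partition_function(b)
    return sum(m_s * math.exp(m_s * b) for m_s in (+1, -1)) / z


def mean_ms_from_z(b, h=1e-6):
    """<m_s> = d ln Z / d b, estimated with a central finite difference of step h."""
    return (math.log(partition_function(b + h)) - math.log(partition_function(b - h))) / (2 * h)


for b in (0.0, 0.01, 0.5, 5.0):
    print(f"b = {b:5.2f}  direct = {mean_ms_direct(b):.6f}  "
          f"from Z = {mean_ms_from_z(b):.6f}  tanh(b) = {math.tanh(b):.6f}")
```

For small $b$ the three columns should track $b$ itself, which is Curie’s law, and for large $b$ they should approach unity, which is saturation.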



Figure 6.2 Magnetisation $M$ of a two-level paramagnet as a function of external magnetic field
$B$. Linear dependence at low external field goes over to saturation at higher fields.

6.4 Single Quantum Oscillator

We based our discussion of statistical thermodynamics on quantum oscillators in Chapter 
5, and it is instructive to consider now the canonical statistical properties of a 1-d quantum 
harmonic oscillator using the partition function. The energies of the microstates are
$E_n = (n + 1/2)\hbar\omega$, such that the partition function is


$$Z = \exp\left(-\frac{x}{2}\right)\sum_{n=0}^{\infty}\exp(-nx), \tag{6.23}$$


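where $x = \hbar\omega/kT$. This is a geometric series, and sums to

$$Z = \frac{\exp(-x/2)}{1 - \exp(-x)} = \frac{1}{2\sinh(x/2)}, \tag{6.24}$$

and the mean vibrational energy $E_{\rm vib}$ is given by

$$\langle E_{\rm vib}\rangle = -\frac{1}{Z}\frac{dZ}{d\beta} = -\frac{\hbar\omega}{Z}\frac{dZ}{dx} = 2\hbar\omega\sinh\left(\frac{x}{2}\right)\frac{2\cosh(x/2)}{\left(2\sinh(x/2)\right)^2}\,\frac{1}{2} = \frac{\hbar\omega}{2\tanh(x/2)}. \tag{6.25}$$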


This has the correct behaviour at high and low temperature: it is equal to the zero
point energy $\hbar\omega/2$ for $T \ll \hbar\omega/k$, where thermal excitation is very weak, and goes
to the classical equipartition result $\langle E_{\rm vib}\rangle = kT$ as $T \to \infty$, as shown in Figure 6.3.
The agreement with the equipartition result obtained for a single classical oscillator in
Section 6.1.2 suggests that the constant density of states across the $(x,v)$ phase space
used in that derivation was appropriate.
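As a check on (6.25), the mean energy can also be obtained by summing over the levels $E_n = (n + 1/2)\hbar\omega$ directly. The sketch below is a hedged illustration: the function names and the truncation `n_max` are our own choices, and energies are measured in units of $\hbar\omega$, with `t` standing for $kT/\hbar\omega$.

```python
import math


def mean_energy_series(t, n_max=2000):
    """<E_vib>/(hbar*omega) from a truncated sum over levels n + 1/2; t = kT/(hbar*omega)."""
    x = 1.0 / t
    # The common factor exp(-x/2) cancels between numerator and denominator.
    weights = [math.exp(-n * x) for n in range(n_max)]
    z = sum(weights)
    return sum((n + 0.5) * w for n, w in enumerate(weights)) / z


def mean_energy_closed(t):
    """Closed form (6.25) in the same units: 1 / (2 tanh(x/2)) with x = 1/t."""
    return 0.5 / math.tanh(0.5 / t)


for t in (0.05, 0.5, 1.0, 10.0):
    print(f"kT/(hbar omega) = {t:5.2f}  series = {mean_energy_series(t):.6f}  "
          f"closed form = {mean_energy_closed(t):.6f}")
```

At the lowest temperature the value should sit close to the zero point energy of one half, and at the highest it should be close to `t`, the equipartition result.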



Figure 6.3 Mean energy of a 1-d quantum oscillator as a function of temperature. The classical
result $kT$ is obtained for $T \gg \hbar\omega/k$. Below this temperature, the mean energy tends towards the
zero point energy of the oscillator $\hbar\omega/2$.

6.5 Heat Capacity of a Diatomic Molecular Gas


The classical and quantum limits of the canonical statistical behaviour of an oscillator can
be observed in the temperature dependence of the heat capacity of a diatomic molecular
gas such as O$_2$. The vibrational motion of the two atoms with respect to one another may
be modelled as a 1-d quantum oscillator. Using (6.25), the heat capacity of molecular
vibration is

$$C_{\rm vib} = \frac{d\langle E_{\rm vib}\rangle}{dT} = \frac{\hbar\omega}{2}\frac{d}{dT}\left[\frac{1}{\tanh(\hbar\omega/2kT)}\right] = \frac{\hbar^2\omega^2}{4kT^2\sinh^2(\hbar\omega/2kT)}. \tag{6.26}$$


The classical expression emerges for $T \gg \hbar\omega/k$, in which case we can write $\hbar\omega/2kT \ll 1$
and $\sinh(\hbar\omega/2kT) \approx \hbar\omega/2kT$, such that $C_{\rm vib} \to k$. In the other extreme, $T \ll \hbar\omega/k$
and $\sinh(\hbar\omega/2kT) \approx (1/2)\exp(\hbar\omega/2kT)$ and so $C_{\rm vib} \to 0$.

This behaviour is apparent in experimental data for diatomic molecules such as
O$_2$, illustrated in Figure 6.4. On the basis of the equipartition theorem considered in
Section 6.4, a contribution of $k/2$ per degree of freedom would be expected. There
are three components of centre of mass linear momentum, two degrees of freedom for
rotation of the molecule about the centre of mass (a third does not arise for a diatomic
molecule) and then two oscillator degrees of freedom from the vibrational motion, one
each for the potential energy and kinetic energy contributions. Therefore, we would
expect to see a constant volume heat capacity per molecule of $7k/2$, and indeed at
temperatures above 1500 K for O$_2$, this is what we see experimentally. However, as
the temperature is reduced, the heat capacity falls to $5k/2$, as sketched in Figure 6.4.
This is taken as evidence that the mean energy due to vibration is suppressed at lower
temperatures corresponding to the approach to the quantum limit. This is referred to
as the ‘freezing out’ of the vibrational degrees of freedom, and the temperature $T_{\rm vib}$
below which this sets in should correspond to the condition $\hbar\omega/kT_{\rm vib} \sim 1$, where $\omega$ is
the vibrational frequency of the molecular bond.

Figure 6.4 Heat capacity of a diatomic gas per molecule at constant volume as a function
of temperature. Below a temperature $T_{\rm vib}$, the vibrational degrees of freedom are progressively
‘frozen out’ until the vibrational energy of molecular oscillation is dominated by the
temperature-independent zero point energy. Below a temperature $T_{\rm rot}$, rotational degrees of
freedom are also frozen out.

Furthermore, below a temperature $T_{\rm rot}$, there is a suppression of the heat capacity
from $5k/2$ to $3k/2$, which is interpreted as the freezing out of the rotational degrees
of freedom, and which is explored in an exercise at the end of this chapter. The early
pioneers of statistical thermodynamics were very concerned at the apparent failure of
the equipartition theorem for diatomic gases; the anomalies actually provided evidence
for the quantisation of energy.

The separation of the total energy of a system into translational, vibrational and rotational
energy has the important implication that the partition function can be factorised.
Consider a complex molecule with an energy that separates as

$$E = E_{\rm trans} + E_{\rm vib} + E_{\rm rot} + E_{\rm elec}, \tag{6.27}$$

where a term for the energy associated with the electrons is included as well. The partition
function for such a molecule is

$$Z = \sum_{\rm microstates}\exp\left(-\frac{E}{kT}\right) = \sum_{\rm trans}\exp\left(-\frac{E_{\rm trans}}{kT}\right)\sum_{\rm vib}\exp\left(-\frac{E_{\rm vib}}{kT}\right)\sum_{\rm rot}\exp\left(-\frac{E_{\rm rot}}{kT}\right)\sum_{\rm elec}\exp\left(-\frac{E_{\rm elec}}{kT}\right) = Z_{\rm trans} Z_{\rm vib} Z_{\rm rot} Z_{\rm elec} \tag{6.28}$$

as long as the energies of the microstates of vibrational motion do not depend on the 
value of the translational energy and so on. The factorisation of the partition function 
when describing independent modes of excitation appears in other contexts later in this 
book. 
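The factorisation in (6.28) can be illustrated with a small numerical experiment. The energy levels below are hypothetical numbers chosen purely for demonstration (they are not those of any real molecule), and only three modes are included to keep the enumeration short. The sketch compares a brute-force sum over every combined microstate with the product of the separate partition functions.

```python
import itertools
import math

KT = 1.0  # energies below are expressed in units of kT

# Hypothetical level schemes for three independent modes (illustrative values only).
levels = {
    "trans": [0.0, 0.3, 0.9],
    "vib": [0.5, 1.5, 2.5, 3.5],
    "rot": [0.0, 0.2, 0.6],
}


def mode_partition_function(energies):
    """Partition function of a single mode: sum of Boltzmann factors over its levels."""
    return sum(math.exp(-e / KT) for e in energies)


# Product of the separate partition functions, as on the right of (6.28).
z_product = math.prod(mode_partition_function(e) for e in levels.values())

# Brute-force sum over all combined microstates, whose energies are sums of mode energies.
z_brute = sum(math.exp(-sum(combo) / KT) for combo in itertools.product(*levels.values()))

print(f"product of factors = {z_product:.10f}")
print(f"brute-force sum    = {z_brute:.10f}")
```

The agreement relies on each combined energy being a plain sum of independent contributions; a coupling between modes would spoil it.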


6.6 Einstein Model of the Heat Capacity of Solids 


The analysis of a quantum oscillator can be used to understand the heat capacity of 
a solid. Albert Einstein (1879-1955) proposed a model of the vibrational energy of a
solid based on the idea that each particle vibrates about its rest position at the same
angular frequency $\omega_E$, known as the Einstein frequency. Each particle therefore may be
represented by a 3-d quantum oscillator. The vibrational frequency spectrum of a solid 
is much broader and richer than this, but the single frequency approximation makes the 
analysis quite easy as each particle makes the same contribution to the heat capacity. 

Using (6.26), the Einstein vibrational heat capacity of a solid consisting of $N$ particles is

$$C_E = \frac{3N\hbar^2\omega_E^2}{4kT^2\sinh^2(\hbar\omega_E/2kT)}, \tag{6.29}$$

which goes to $3Nk$ for $T \gg T_E = \hbar\omega_E/k$, where $T_E$ is called the Einstein temperature.
$C_E$ approximates to the form $3Nk(T_E/T)^2\exp(-T_E/T)$ for $T \ll T_E$, as illustrated in
Figure 6.5. Although experimental data deviates somewhat from this particular expression
at low temperature, the rough agreement was used by Einstein as evidence that the
vibrational energy of a solid is indeed quantised, and that it progressively becomes
frozen out as the temperature is decreased. Such an application offers strong support for
the validity of statistical thermodynamics.

Figure 6.5 Temperature dependence of the Einstein vibrational heat capacity of a solid of $N$
atoms, each oscillating at a common frequency $\omega_E$ related to the Einstein temperature $T_E = \hbar\omega_E/k$.
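The two limits quoted for (6.29) are easy to confirm numerically. The following sketch uses our own function names and works with the reduced temperature $T/T_E$, returning the heat capacity in units of $3Nk$; it is intended as an illustration rather than a fit to any data.

```python
import math


def einstein_heat_capacity(t_over_te):
    """C_E / (3 N k) from (6.29), as a function of the reduced temperature T / T_E."""
    half_x = 0.5 / t_over_te  # hbar * omega_E / (2 k T)
    return (half_x / math.sinh(half_x)) ** 2


def low_temperature_form(t_over_te):
    """Approximate form (T_E / T)^2 exp(-T_E / T), also in units of 3 N k."""
    return t_over_te ** -2 * math.exp(-1.0 / t_over_te)


for t in (0.05, 0.1, 0.5, 1.0, 5.0, 50.0):
    print(f"T/T_E = {t:6.2f}  C_E/(3Nk) = {einstein_heat_capacity(t):.6e}  "
          f"low-T form = {low_temperature_form(t):.6e}")
```

The first column of results should approach unity at the top of the range, while the two columns should agree closely at the bottom of the range and part company as $T$ approaches $T_E$.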


6.7 Vacancies in Crystals 


Our next example concerns the presence of defects in crystals known as vacancies. In 
a perfect crystal, the N atoms form a precisely ordered spatial array called a lattice. 
But real crystals will contain the occasional unoccupied lattice site, and these are called 
vacancies. The displaced atom can either squeeze in between its neighbours somewhere, 
when it is called an interstitial atom, or migrate to the crystal surface or some other 
interface. 

Let us make things simple and consider each atom to have two available positions: in
its proper place with energy zero, or displaced elsewhere with positive energy $E_v$. The
configuration of the atoms, and hence of the crystal, is a microstate labelled by the state
of occupancy of each site in the lattice. The partition function is


$$Z = \sum_{\{n_j\}}\exp\left(-\frac{\sum_j n_j E_v}{kT}\right), \tag{6.30}$$


where the set $\{n_1, n_2, n_3, \cdots, n_N\}$ denotes the state of occupancy of each of the $N$ sites:
$n_j = 0$ means site $j$ is occupied and $n_j = 1$ means that it is vacant. The energy of the
configuration is therefore $\sum_{j=1}^{N} n_j E_v$. The partition sum is over microstates $\{n_j\}$, that is,
over all possible values of all the $n_j$.

This model is a useful illustration of how the statistical properties of a system of
independent components can be regarded as the aggregate of the statistical properties of
each component. As the energy is a sum of separate contributions, we can factorise the
partition function as follows

$$Z = \sum_{n_1=0,1}\exp\left(-\frac{n_1 E_v}{kT}\right)\sum_{n_2=0,1}\exp\left(-\frac{n_2 E_v}{kT}\right)\cdots\sum_{n_N=0,1}\exp\left(-\frac{n_N E_v}{kT}\right),$$

and evaluate each part. The first is just $[1 + \exp(-E_v/kT)]$. In fact, every factor is the
same; so the partition function is simply

$$Z = \left[1 + \exp\left(-\frac{E_v}{kT}\right)\right]^N. \tag{6.32}$$


We can interpret $N_v = \sum_j n_j$ as the total number of vacancies in the microstate
$\{n_j\}$. The mean number of vacancies $\langle N_v\rangle$ is therefore given by
$\langle\sum_j n_j\rangle = Z^{-1}\sum_{\{n_j\}}\left(\sum_j n_j\right)\exp\left(-\sum_j n_j E_v/kT\right)$, but considering (6.30),
this is equivalent to $-d\ln Z/dy$ where $y = E_v/kT$. Thus

$$\langle N_v\rangle = -\frac{d}{dy}\ln\left(1 + \exp(-y)\right)^N = -N\frac{d}{dy}\ln\left(1 + \exp(-y)\right) = \frac{N}{1 + \exp(y)}. \tag{6.33}$$


Let us consider the implications of this expression. As the temperature is raised, $\langle N_v\rangle$
increases towards an upper limit of $N/2$. In contrast, as $T \to 0$, we find that $\langle N_v\rangle \approx
N\exp(-E_v/kT)$ such that vacancies become rare as the temperature is reduced. Notice
that this result is in accord with the rule of thumb that the Boltzmann factor $\exp(-E_v/kT)$
determines the likelihood of a rare event such as vacancy formation at low temperature.
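For a small lattice, the sum over microstates in (6.30) can be carried out explicitly and compared with (6.32) and (6.33). The sketch below is illustrative: the function names are ours, `y` stands for $E_v/kT$, and the lattice is kept to ten sites so that all $2^{10}$ occupancy patterns can be enumerated.

```python
import itertools
import math


def brute_force(n_sites, y):
    """Return (Z, <N_v>) by summing over all 2**n_sites occupancy patterns; y = E_v/(kT)."""
    z = 0.0
    weighted_vacancies = 0.0
    for occupancy in itertools.product((0, 1), repeat=n_sites):
        n_v = sum(occupancy)           # number of vacant sites in this microstate
        weight = math.exp(-n_v * y)    # Boltzmann factor for energy n_v * E_v
        z += weight
        weighted_vacancies += n_v * weight
    return z, weighted_vacancies / z


def closed_form(n_sites, y):
    """Return (Z, <N_v>) from the factorised results (6.32) and (6.33)."""
    return (1.0 + math.exp(-y)) ** n_sites, n_sites / (1.0 + math.exp(y))


for y in (0.0, 1.3, 10.0):
    print(f"y = {y:5.1f}  brute force = {brute_force(10, y)}  closed form = {closed_form(10, y)}")
```

At $y = 0$ half the sites are vacant on average, and at large $y$ the mean population should be close to $N\exp(-y)$.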

The mean energy of the crystal associated with its defects is the mean number of
vacancies times the energy of vacancy formation, $\langle N_v\rangle E_v$. The contribution to the heat
capacity of the crystal is therefore $d(\langle N_v\rangle E_v)/dT$. This is

$$C_v = \frac{NE_v^2}{kT^2}\frac{\exp(E_v/kT)}{\left[1 + \exp(E_v/kT)\right]^2} = \frac{NE_v^2}{kT^2}\frac{1}{\left[\exp(-E_v/2kT) + \exp(E_v/2kT)\right]^2} = \frac{NE_v^2}{4kT^2}\,\mathrm{sech}^2\left(\frac{E_v}{2kT}\right). \tag{6.34}$$


This supplements the heat capacity due to vibrations of the crystal discussed in 
Section 6.6. 

As the temperature increases, expression (6.34) goes through a peak at around $T =
E_v/k$, as sketched in Figure 6.6. As $T$ rises, the increasing value of $\mathrm{sech}^2(E_v/2kT)$,
which behaves like $\exp(-E_v/kT)$ for $T \ll E_v/k$, is countered by the decreasing value
of $T^{-2}$. However, as a typical vacancy formation energy is around 1 eV, the peak will lie
at $T \sim 1.6\times 10^{-19}/1.38\times 10^{-23} \sim 10^4$ K, and the crystal will have melted before then.
Nevertheless, any system containing components that each have two possible microstates
will behave in a similar manner, and as long as the energy scale is suitable then such
a peak, known as a Schottky anomaly, will be observable above the vibrational heat
capacity background. For example, the two-level paramagnet discussed in Section 6.3
has a mean energy of $\langle E\rangle = -N\mu_B B\tanh(\mu_B B/kT)$ for $N$ dipoles, and a heat capacity of

$$C = \frac{d\langle E\rangle}{dT} = \frac{N\mu_B^2 B^2}{kT^2}\,\mathrm{sech}^2\left(\frac{\mu_B B}{kT}\right), \tag{6.35}$$

which takes the same form as (6.34). The temperature $T_m = \mu_B B/k$ for a magnetic
field of order 1 tesla is a few kelvin, and Schottky anomalies can be readily detected in
paramagnetic materials at such temperatures, providing further confidence in the methods
of the canonical ensemble.

Figure 6.6 Heat capacity $C_v$ of a crystal due to a temperature-dependent mean population $\langle N_v\rangle$ of
vacancies. The peak is known as a Schottky anomaly, which might be visible above the monotonic
temperature behaviour of the heat capacity arising from solid vibrations sketched in Figure 6.5,
assuming that the solid has not melted.
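The position of the Schottky peak can be located with a simple scan of (6.34). This is a rough sketch under our own choices of grid and function name; the reduced temperature `t` stands for $kT/E_v$, and the conversion to kelvin assumes a formation energy of exactly 1 eV together with rounded values of the physical constants.

```python
import math


def schottky_heat_capacity(t):
    """C_v / (N k) from (6.34), with t = kT / E_v."""
    y = 1.0 / t
    return 0.25 * y * y / math.cosh(0.5 * y) ** 2


# Scan a grid of reduced temperatures and pick the one with the largest heat capacity.
grid = [0.05 + 0.001 * i for i in range(3000)]
t_peak = max(grid, key=schottky_heat_capacity)

# Convert to kelvin for an assumed vacancy formation energy of 1 eV (rounded constants).
E_V_JOULES = 1.602e-19
K_BOLTZMANN = 1.381e-23

print(f"peak at kT/E_v = {t_peak:.3f} with C_v/(Nk) = {schottky_heat_capacity(t_peak):.3f}")
print(f"for E_v = 1 eV this corresponds to T = {t_peak * E_V_JOULES / K_BOLTZMANN:.0f} K")
```

The scan suggests that the maximum sits somewhat below $T = E_v/k$, at the same order of magnitude as the estimate in the text. With the substitution $E_v \to 2\mu_B B$ the same function describes the paramagnet result (6.35).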


Exercises 


6.1 The canonical partition function of a classical 1-d harmonic oscillator of mass $m$
and spring constant $\kappa$ may be written as

$$Z = \frac{1}{h_0}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\left[-\frac{1}{kT}\left(\frac{p^2}{2m} + \frac{\kappa x^2}{2}\right)\right]dx\,dp,$$

where $x$ and $p$ are the oscillator position and momentum, respectively, and $h_0$
is a constant. (a) Evaluate $Z$ and hence the mean energy $\langle E\rangle$ of the oscillator
in equilibrium with a heat bath at temperature $T$. You may assume that
$\int_{-\infty}^{\infty}\exp(-\alpha x^2)\,dx = (\pi/\alpha)^{1/2}$. (b) Calculate the Helmholtz free energy of the
oscillator, and show that the entropy is given by $S = k\left(1 + \ln\left[2\pi kT/(h_0\omega)\right]\right)$ where
$\omega = (\kappa/m)^{1/2}$ is the natural angular frequency of the oscillator. (c) Demonstrate that
the oscillator entropy does not satisfy the third law of thermodynamics and explain
physically why this is so.

6.2 The canonical partition function of a 1-d quantum harmonic oscillator is

$$Z = \left[2\sinh\left(\hbar\omega/(2kT)\right)\right]^{-1}.$$

(a) Evaluate the mean energy of the quantum oscillator when in equilibrium with
a heat bath at temperature $T$. (b) Evaluate the Helmholtz free energy and entropy
of the quantum oscillator. (c) Does the oscillator entropy satisfy the third law of
thermodynamics? (d) By considering the high temperature limit of the partition
function of the quantum oscillator, identify the constant $h_0$ employed in the classical
treatment in the previous question.

6.3 The spring constant of a classical harmonic oscillator is very slowly changed from
$\kappa$ to $2\kappa$ while the oscillator remains in thermal equilibrium with a heat bath at
temperature $T$. Making use of results obtained in question 6.1: (a) Calculate the
change in mean energy of the oscillator. (b) Calculate the change in entropy of
the oscillator. (c) Calculate the heat delivered to the heat bath during the process.
(d) Calculate the change in Helmholtz free energy of the oscillator. (e) Calculate
the work done on the oscillator during the process. (f) If the process were repeated
more rapidly, would your answer to part (e) be greater than, less than, or the same
as in the very slow process?

6.4 The magnetic moment of an atom may take two orientations with respect to an
external magnetic field $B$: aligned with the field, with energy $-\mu_B B$, or against the
field with energy $+\mu_B B$. (a) Calculate the canonical partition function of an array
of $N$ atoms in equilibrium with a heat bath at temperature $T$. (b) Show that the mean
energy of the array is given by $\langle E\rangle = -N\mu_B B\tanh(\mu_B B/kT)$. (c) Show that the
entropy of the array is given by

$$S = -\frac{N\mu_B B}{T}\tanh\left(\frac{\mu_B B}{kT}\right) + Nk\ln\left(2\cosh\left(\frac{\mu_B B}{kT}\right)\right).$$

(d) Evaluate the entropy of the array for $B = 0$ and $B \to \infty$. (e) Show that the
standard deviation of the energy is given by $\sigma_E = \mu_B B N^{1/2}/\cosh(\mu_B B/kT)$.

6.5 Demonstrate that the following expression 



is equal to the Gibbs free energy of a system. 

6.6 The rotational energy of a molecule is quantised according to $E_{\rm rot}(\ell) = \ell(\ell + 1)\Theta$
where $\Theta$ is a constant, and with $\ell = 0, 1, 2$ and so on. The number of rotational
quantum states at a given value of $\ell$ is $2\ell + 1$. Write down the rotational canonical
partition function of the molecule as a sum over $\ell$. State the probability that the
molecule might be found with a particular value of $\ell$. At high temperatures, it
is possible to regard the general term in the sum as a function of a continuous
variable $\ell$. Show that the most probable value of $\ell$ in these circumstances is

$$\ell_{\rm mode} \approx \frac{1}{2}\left[\left(\frac{2kT}{\Theta}\right)^{1/2} - 1\right],$$

and interpret this in terms of the suppression of rotational energy as the temperature
is reduced.




7 

The Grand Canonical 
Ensemble and Grand 
Partition Function 


The behaviour of a system in contact with a heat bath has been investigated using 
the ideas of statistical thermodynamics in previous chapters and we have developed 
a scheme, the canonical ensemble, that provides a framework for a treatment of such 
behaviour. The next step is to develop the statistical thermodynamics of a system in 
contact with a reservoir that provides particles as well as energy. The ensemble of 
system microstates that emerges is designed to capture the statistical properties of a 
system coupled to a heat and particle bath. It is called the grand canonical ensemble. 
This is very suitable for a description of coexistence between phases such as a gas and 
a liquid. We shall also find that vacancy formation in crystals, discussed in Section 6.7, 
can be naturally treated with these methods. 


7.1 System of Harmonic Oscillators 

As in Chapter 5, it is instructive to use an example involving quantised harmonic oscillators
as the basis for discussion. Consider a set of $N_{\rm tot}$ harmonic oscillators holding $Q_{\rm tot}$
energy quanta. Our system of interest is now a variable size group of these oscillators.
How it might be that oscillators are counted as in or out of the group is not important, 
but the idea is akin to a situation where people associate together, and make or break 
their social bonds to each other in a rather complicated way. A physical example that 
might help intuition is to imagine the oscillators floating around inside a container, and 
as they pass in and out of a defined subvolume, we regard them as entering and exiting 
our system of interest. The ensemble we plan to develop would tell us the likelihood 
that the subvolume should contain a certain number of oscillators. 

Alternatively, we could be more abstract and simply consider the creation and removal
of network connections between the oscillators, without their having to move, and
imagine that the pattern of connections evolves in time in some complex manner.
A system would then correspond to the set of interconnected oscillators, and the oscillators
that are not part of the network would be regarded as components of the reservoir.
The dynamics that modify connections would run alongside the dynamics of the exchange
of quanta between oscillators considered in Section 4.2. Such a network is illustrated in
Figure 7.1, undergoing a change in size as a new connection is made.

Figure 7.1 A system of oscillators is defined by a set of connections between them that appear
and disappear as time evolves according to given rules. In the illustration, the system grows in
size from $N = 5$ to $N = 6$. The disconnected oscillators form the reservoir. If the oscillators
also exchange quanta of energy, then we have a framework for describing a system in contact
with a heat and particle bath. The grand canonical ensemble is the collection of all available
conformations of the system.

The system moves from microstate to microstate in a complicated phase space that
is now labelled by variable $N$ as well as the distribution of quanta $\{q_1, q_2, \cdots, q_N\}$. We
need to establish the statistical weighting of these system microstates. We expect to find
that the average number of oscillators in the system, as well as the average number 
of quanta, will depend on the properties of the reservoir, and we would also like to 
construct an analogue of the chemical potential of the reservoir. Fluctuations around the 
mean population will emerge from the statistical treatment. 

We start as in the canonical case by considering the microcanonical ensemble of the 
system plus the reservoir. We define macrostates of this larger phase space labelled by the 
number of oscillators N in the system and their total number of quanta Q. The microstate 
multiplicity of a macrostate would be 

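$$\Omega(N,Q)\,\Omega_r(N_{\rm tot} - N, Q_{\rm tot} - Q), \tag{7.1}$$

formed from the product of the separate microstate multiplicities of system and reservoir
when characterised by parameters $(N, Q)$ and $(N_{\rm tot} - N, Q_{\rm tot} - Q)$, respectively. As in
Section 5.3, we have denoted the multiplicity of the reservoir using suffix r for added
clarity.

If $N \ll N_{\rm tot}$ and $Q \ll Q_{\rm tot}$, we can expand $\ln\Omega_r$ to first order in $Q$ and $N$:

$$\ln\Omega_r(N_{\rm tot} - N, Q_{\rm tot} - Q) \approx \ln\Omega_r(N_{\rm tot}, Q_{\rm tot}) - Q\frac{\partial\ln\Omega_r(N_{\rm tot}, Q_{\rm tot})}{\partial Q_{\rm tot}} - N\frac{\partial\ln\Omega_r(N_{\rm tot}, Q_{\rm tot})}{\partial N_{\rm tot}}. \tag{7.2}$$

We again define

$$\hat\beta = \frac{\partial\ln\Omega_r(N_{\rm tot}, Q_{\rm tot})}{\partial Q_{\rm tot}}, \tag{7.3}$$

and now introduce a parameter $\hat\mu$ such that

$$\hat\beta\hat\mu = -\frac{\partial\ln\Omega_r(N_{\rm tot}, Q_{\rm tot})}{\partial N_{\rm tot}}. \tag{7.4}$$

For $N_{\rm tot} \gg 1$ and $Q_{\rm tot} \gg 1$, we can employ (5.3) and write $\ln\Omega_r(N_{\rm tot}, Q_{\rm tot}) \approx
(Q_{\rm tot} + N_{\rm tot})\ln(Q_{\rm tot} + N_{\rm tot}) - Q_{\rm tot}\ln Q_{\rm tot} - N_{\rm tot}\ln N_{\rm tot}$. Then, we find that

$$\hat\beta = \ln(Q_{\rm tot} + N_{\rm tot}) + 1 - \ln Q_{\rm tot} - 1 = \ln\left(1 + \frac{N_{\rm tot}}{Q_{\rm tot}}\right), \tag{7.5}$$

as in (5.12), and

$$\hat\mu = -\hat\beta^{-1}\ln\left(1 + \frac{Q_{\rm tot}}{N_{\rm tot}}\right) = -\frac{\ln\left(1 + Q_{\rm tot}/N_{\rm tot}\right)}{\ln\left(1 + N_{\rm tot}/Q_{\rm tot}\right)}. \tag{7.6}$$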


The parameters $\hat\beta$ and $\hat\mu$ are clearly properties of the combined system and reservoir, as
they depend on $N_{\rm tot}$ and $Q_{\rm tot}$. If the number of oscillators and quanta in the reservoir are
typically much larger than those in the system, we may also regard them as properties
of the reservoir on its own. Clearly, we intend the $\hat\mu$ parameter to represent the chemical
potential of classical thermodynamics, though since energy in our example is measured
in units of $\hbar\omega$, both $\hat\beta$ and $\hat\mu$ are dimensionless. It is interesting to notice that $\hat\mu$ is
negative, in line with the claims made in Section 2.11.

In this case, the parameters $\hat\beta$ (inverse temperature) and $\hat\mu$ (chemical potential) are not
independent of each other, in contrast to the independence of $T$ and $\mu$ for a physical heat
and particle bath in classical thermodynamics. The reason for this is that the reservoir is
specified by only two parameters, $N_{\rm tot}$ and $Q_{\rm tot}$, and we lack a third parameter, analogous
to the volume occupied by a gas, to add further flexibility. Nevertheless, the example is
still instructive.
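The large-number expressions for the two reservoir parameters can be tested against exact counting. The sketch below assumes that the multiplicity of $n$ oscillators sharing $q$ quanta is the standard combinatorial result $(q + n - 1)!/(q!\,(n - 1)!)$, which is our reading of the counting behind (5.3); the function names and the reservoir size are illustrative choices. Derivatives of $\ln\Omega_r$ are estimated by adding a single quantum or a single oscillator.

```python
import math


def ln_omega(n, q):
    """Exact ln of the assumed multiplicity (q + n - 1)! / (q! (n - 1)!)."""
    return math.lgamma(q + n) - math.lgamma(q + 1) - math.lgamma(n)


def beta_hat(n_tot, q_tot):
    """Dimensionless inverse temperature, ln(1 + N_tot / Q_tot)."""
    return math.log(1 + n_tot / q_tot)


def mu_hat(n_tot, q_tot):
    """Dimensionless chemical potential, -ln(1 + Q_tot / N_tot) / beta_hat."""
    return -math.log(1 + q_tot / n_tot) / beta_hat(n_tot, q_tot)


n_tot, q_tot = 1_000_000, 2_000_000  # the choice Q_tot = 2 N_tot

# One-step finite differences of the exact ln(Omega_r).
d_dq = ln_omega(n_tot, q_tot + 1) - ln_omega(n_tot, q_tot)
d_dn = ln_omega(n_tot + 1, q_tot) - ln_omega(n_tot, q_tot)

print(f"d ln(Omega_r)/dQ = {d_dq:.6f}    beta_hat          = {beta_hat(n_tot, q_tot):.6f}")
print(f"-d ln(Omega_r)/dN = {-d_dn:.6f}  beta_hat * mu_hat = "
      f"{beta_hat(n_tot, q_tot) * mu_hat(n_tot, q_tot):.6f}")
print(f"mu_hat = {mu_hat(n_tot, q_tot):.6f}")
```

With $Q_{\rm tot} = 2N_{\rm tot}$ the parameters depend only on the ratio of quanta to oscillators, which is one way of seeing that they cannot be varied independently in this model.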

We now see from (7.2) that the microstate multiplicity of a macrostate characterised
by $N$ and $Q$ can be expressed as

$$\Omega(N,Q)\,\Omega_r(N_{\rm tot} - N, Q_{\rm tot} - Q) \approx \Omega(N,Q)\,\Omega_r(N_{\rm tot}, Q_{\rm tot})\exp\left(-\hat\beta(Q - \hat\mu N)\right), \tag{7.7}$$

having used (7.2-7.4). This provides the statistical weighting for an ensemble of
all system macrostates. This is illustrated for a range of $N$ and $Q$ with $\hat\beta = \ln(3/2)$
and $\hat\beta\hat\mu = -\ln 3$, arising from an underlying choice $Q_{\rm tot} = 2N_{\rm tot}$, in Figure 7.2. The
distribution over quanta $Q$ is peaked, as we might expect from the considerations in
Section 5.4, and the distribution over $N$ falls steadily.

Figure 7.2 Surface indicating the grand canonical weightings of $(N, Q)$ macrostates of a system
of oscillators in contact with a heat and particle bath at $\hat\beta = \ln(3/2)$ and $\hat\mu = -\ln 3/\ln(3/2)$. The
ranges chosen are $0 \le Q \le 40$ and $0 \le N \le 10$. The peak rises further around $N = 0$, $Q = 0$, but
is cut off for clarity.

The probability distribution over the ensemble of system macrostates is

$$P(N,Q) = Z_G^{-1}\,\Omega(N,Q)\exp\left(-\hat\beta(Q - \hat\mu N)\right), \tag{7.8}$$

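where we have introduced a new kind of partition function playing the role of the
normalising factor in the probability distribution. It is related to a sum over canonical
partition functions $Z(\hat\beta, N)$ of systems with fixed $N$:

$$Z_G(\hat\mu, \hat\beta) = \sum_N\sum_Q\Omega(N,Q)\exp\left[-\hat\beta(Q - \hat\mu N)\right] = \sum_N\exp\left(\hat\beta\hat\mu N\right)\left[\sum_Q\Omega(N,Q)\exp\left(-\hat\beta Q\right)\right] = \sum_N Z(\hat\beta, N)\exp\left(\hat\beta\hat\mu N\right). \tag{7.9}$$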


The ensemble of system macrostates labelled by $N$ and $Q$ could also be written
as an ensemble of system microstates labelled by the specific populations of quanta
$\{q_1, q_2, \cdots, q_N\}$ residing in the system oscillators, in which case we would write

$$P(N, \{q_j\}) = Z_G^{-1}\exp\left(-\hat\beta(Q - \hat\mu N)\right), \tag{7.10}$$

and

$$Z_G = \sum_N\sum_{\{q_j\}}\exp\left(-\hat\beta(Q - \hat\mu N)\right), \tag{7.11}$$

where $Q = \sum_j q_j$. The ensemble is a collection of copies of the system, prepared in
every possible microstate, where both the number of quanta and the particle number can
vary. It is called the grand canonical ensemble and $Z_G$ is the grand partition function: a
summation of the so-called Gibbs factor $\exp(-\hat\beta(Q - \hat\mu N))$ over all microstates. It is a
function of the dimensionless reservoir parameters $\hat\mu$ and $\hat\beta$.

The principal role for the grand canonical ensemble is to enable us to understand how 
the reservoir chemical potential controls the mean number of particles in a system, and 
how that number might fluctuate. Just as we discussed in Section 5.4, we expect there 
to be a thermodynamic limit where the distribution over particle number in the system 
has a sharp peak. The expression in (7.8) is a product of $\Omega(N,Q)$, which is a rapidly
increasing function of $N$, and, recalling that $\hat\mu < 0$, a decreasing function $\exp(\hat\beta\hat\mu N)$.
The location of the peak with respect to $N$ increases as $\hat\mu$ increases, or more precisely
as $\hat\mu$ is made less negative, when the rate of exponential decrease in the second factor
is weakened.

For example, considering $Q$ to be constant, we have $\Omega(N,Q) \sim N^Q$ for $N \gg Q$ from
(5.5) and the peak in $\Omega(N,Q)\exp(\hat\beta\hat\mu N)$ lies at $N = N^*$, found from the condition

$$\frac{d}{dN}\ln\left(\Omega(N,Q)\exp(\hat\beta\hat\mu N)\right) \approx \frac{d}{dN}\left(Q\ln N + \hat\beta\hat\mu N\right) = \frac{Q}{N} + \hat\beta\hat\mu = 0, \tag{7.12}$$

such that $N^* \propto -\hat\mu^{-1}$. In the thermodynamic limit brought about by $\hat\mu \to 0$ from below,
in which $N^*$ becomes macroscopic, the standard deviation in the particle number grows
less quickly than the mean, such that fluctuations about the mean essentially disappear.
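The estimate $N^* = -Q/(\hat\beta\hat\mu)$ from (7.12) can be compared with the peak of the exact weighting. The sketch below again assumes the standard multiplicity $(Q + N - 1)!/(Q!\,(N - 1)!)$ for the system, and treats the product $\hat\beta\hat\mu$ as a freely chosen number for illustration, even though in the oscillator model it is fixed by the reservoir. The function names and the search range `n_max` are our own.

```python
import math


def ln_weight(n, q, beta_mu):
    """ln[Omega(N, Q) exp(beta_mu * N)] with Omega = (Q + N - 1)! / (Q! (N - 1)!)."""
    return math.lgamma(q + n) - math.lgamma(q + 1) - math.lgamma(n) + beta_mu * n


def peak_n(q, beta_mu, n_max=20_000):
    """Integer N in [1, n_max) that maximises the weighting at fixed Q."""
    return max(range(1, n_max), key=lambda n: ln_weight(n, q, beta_mu))


q = 5
for beta_mu in (-0.1, -0.01, -0.001):
    print(f"beta*mu = {beta_mu:7.3f}  exact peak N* = {peak_n(q, beta_mu):5d}  "
          f"estimate -Q/(beta*mu) = {-q / beta_mu:8.1f}")
```

The estimate should improve as $\hat\beta\hat\mu$ approaches zero from below, since the approximation $\Omega \sim N^Q$ requires $N \gg Q$.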


7.2 Grand Canonical Ensemble for a General System 


Having introduced the concept of the grand canonical ensemble using the example of the
system of oscillators, we now generalise. We consider systems specified by macrostate
variables energy $E$ and number of particles $N$, and propose the following equilibrium
probability distribution over macrostates:

$$P(N,E) = \frac{1}{Z_G}\,\Omega(N,E)\exp\left(-\frac{(E - \mu N)}{kT}\right), \tag{7.13}$$

where the grand partition function is

$$Z_G(\mu, T) = \sum_{\rm microstates}\exp\left(-\frac{(E - \mu N)}{kT}\right) = \sum_N\sum_E\Omega(N,E)\exp\left(-\frac{(E - \mu N)}{kT}\right). \tag{7.14}$$

The system is considered to be coupled to a reservoir with a temperature $T$ and a
parameter $\mu$, with dimensions of energy, defined as

$$\mu = -kT\frac{\partial\ln\Omega_r(N_r, E_r)}{\partial N_r}. \tag{7.15}$$

Using Boltzmann’s expression for the entropy $S = k\ln\Omega$, this definition is consistent
with that of the chemical potential $\mu = -T\,\partial S/\partial N$ in classical thermodynamics, given
by (2.48).

The main purpose of the grand partition function, as we found with the canonical partition
function, is that it allows ensemble averages to be obtained by differentiation. The
mean number of oscillators in the system, for example, is a weighted sum of macrostate
variable $N$ over the grand canonical ensemble:

$$\langle N\rangle = \sum_N\sum_E N P(N,E) = \frac{1}{Z_G}\sum_N\sum_E N\,\Omega(N,E)\exp\left(-\frac{(E - \mu N)}{kT}\right) = \frac{1}{Z_G}\left(\frac{\partial Z_G}{\partial(\mu/kT)}\right)_T = kT\left(\frac{\partial\ln Z_G}{\partial\mu}\right)_T, \tag{7.16}$$

where we employ partial differentiation since $Z_G$ is a function of both $\mu$ and $T$. We shall
use the grand canonical ensemble, and this result in particular, in Chapters 11 and 12
when we examine quantum gases. We shall find that the grand partition functions for
such cases are actually quite straightforward to evaluate. But as a preliminary example,
let us examine once again the problem of vacancies in crystals.


7.3 Vacancies in Crystals Revisited 


In Section 6.7, we determined the mean population of vacancies at a lattice site using 
a canonical ensemble. A vacancy is not a particle, but rather the absence of one, but 
nevertheless with some bending of the concepts, we can approach the same problem 
using the grand canonical ensemble. 

We consider a system consisting of a single lattice site and imagine it to be exposed to
a reservoir of vacancies at a chemical potential $\mu$, in addition to a heat bath at temperature
$T$. Conceptually, the rest of the crystal acts as a gas of vacancies, in a state of motion as
atoms hop between the various lattice sites, and at a certain concentration and chemical
potential determined by the temperature and the vacancy formation energy $E_v$.

Let us construct the grand partition function for the single lattice site. There are only
two values of the population $N$ of vacancies in the system, namely zero and one, and
for each population, the energy takes just one value, zero and $E_v$, respectively. There
are just two microstates and we write

$$Z_G = \sum_{\rm microstates}\exp\left(-\frac{(E - \mu N)}{kT}\right) = 1 + \exp\left(-\frac{(E_v - \mu)}{kT}\right).$$

The mean number of vacancies per site then follows:

$$\langle N\rangle = kT\left(\frac{\partial\ln Z_G}{\partial\mu}\right)_T = kT\,\frac{(kT)^{-1}\exp\left(-(E_v - \mu)/kT\right)}{1 + \exp\left(-(E_v - \mu)/kT\right)} = \frac{1}{\exp\left((E_v - \mu)/kT\right) + 1}. \tag{7.18}$$

This is similar to (6.33), but the appearance of $\mu$ in (7.18) needs further comment. It is
inserted since we are employing the concept of a vacancy as a real particle that can be
injected from or lost to a reservoir. In Section 6.7, we were calculating the total energy of
a system with a fixed number of real particles, but we interpreted the outcome in terms of
the occupation of lattice sites and hence the population of vacancies. The grand canonical
approach is a more powerful approach than this since there are occasions when a crystal
can effectively be exposed to sources and sinks of vacancies, such that the population
of vacancies might be changed without altering the temperature. Essentially, atoms can
migrate out of the system, controlled by the properties of the environment, and the idea
of a chemical potential for the supply of vacancies is a useful one. But in order that the
previous result (6.33) might be recovered, we must conclude that the chemical potential of
vacancies in an isolated crystal is zero. A similar outcome in different circumstances will
appear in Chapter 13 when we consider the statistical thermodynamics of electromagnetic
radiation, and its associated quantised particle, the photon.
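The single-site calculation lends itself to a short numerical check. The sketch below uses our own function names and arbitrary illustrative values of $E_v$, $kT$ and $\mu$ (all in the same energy units). It compares the direct two-state average with $kT\,\partial\ln Z_G/\partial\mu$ estimated by a central finite difference, and confirms that setting $\mu = 0$ gives the per-site form of (6.33).

```python
import math


def grand_partition(mu, e_v, kt):
    """Z_G for one lattice site with microstates (N=0, E=0) and (N=1, E=E_v)."""
    return 1.0 + math.exp(-(e_v - mu) / kt)


def mean_n_direct(mu, e_v, kt):
    """<N> as the probability of the occupied (N=1) microstate."""
    w1 = math.exp(-(e_v - mu) / kt)
    return w1 / (1.0 + w1)


def mean_n_from_derivative(mu, e_v, kt, h=1e-6):
    """<N> = kT d ln(Z_G) / d mu at fixed T, via a central difference of step h."""
    upper = math.log(grand_partition(mu + h, e_v, kt))
    lower = math.log(grand_partition(mu - h, e_v, kt))
    return kt * (upper - lower) / (2 * h)


e_v, kt = 1.0, 0.5  # illustrative values only
for mu in (-1.0, 0.0, 0.3, 1.0, 2.0):
    print(f"mu = {mu:5.2f}  direct = {mean_n_direct(mu, e_v, kt):.6f}  "
          f"derivative = {mean_n_from_derivative(mu, e_v, kt):.6f}")
```

Raising $\mu$ at fixed temperature increases the mean vacancy population of the site, passing through one half when $\mu = E_v$.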


Exercises 


7.1 A system is in contact with a heat and particle bath at temperature $T$ and chemical
potential $\mu$. If there are only two microstates available to the system, one with
$N = 0$, $E = 0$; and the other with $N = 1$, $E = \epsilon > 0$, derive an expression for the
grand partition function and show that $\langle N\rangle = [\exp((\epsilon - \mu)/kT) + 1]^{-1}$. Sketch the
dependence of $\langle N\rangle$ on the bath chemical potential in the range $-\infty < \mu < \infty$.

7.2 Show that the variance in the population of a system $\sigma_N^2 = \langle N^2\rangle - \langle N\rangle^2$ is given by



Determine $\sigma_N$ for the case in question 7.1, plotting it in the range $-\infty < \mu < \infty$.

7.3 A system can hold zero, one or two particles, and a particle in the system can take
an energy of zero or $E_p = kT \ln 2$. Calculate the grand partition function and hence
the mean particle number for a given $\mu$ and $T$.

7.4 Determine the relationship between the mean energy of a system and the grand 
partition function. 



8 

Statistical Models of Entropy 

Several chapters have passed without much mention of entropy, and 
as this book is an entropic approach, it is time to put this right. 

It is quite acceptable to use the tools of statistical ensembles to study system behaviour
without delving more deeply into explanations, just as it is feasible in quantum mechanics
to master the mechanics of performing calculations without necessarily enquiring deeply
into the physics of quantum phenomena. In statistical thermodynamics, the equally deep
matters are the justification for the principle of equal
a priori probabilities, and the relationship between statistical ideas and the concept of
thermodynamic entropy. This is where most of the trouble in understanding the subject
lies, and in this chapter, we discuss the various views that have developed.

In Section 4.4, we briefly discussed Boltzmann’s insight that entropy is a measure 
of the number of underlying microstates that are compatible with a macroscopically 
measurable state. This is the key principle of statistical thermodynamics, but it does not 
quite stand up to detailed scrutiny for systems, coupled to an environment, that could 
potentially access an infinite number of microstates. We also referred to the possibility 
that Boltzmann’s expression was actually a statement of a connection between entropy 
and microstate probabilities, in the context of an isolated system where the probabilities 
were to be specified by the principle of equal a priori probabilities. We now investigate 
this underlying connection further, which of course could raise the question of what we 
mean by probability, matters that we discussed in Section 4.1. The outcome of such 
considerations is a multifaceted view of entropy, a collage of mutually supportive ideas, 
some of which can extend to systems out of equilibrium. 





8.1 Boltzmann Entropy 

Boltzmann’s model of entropy was introduced in Section 4.4, and we repeat it here with 
a suffix B for clarity: 

$$S_B = k \ln \Omega, \tag{8.1}$$

where $\Omega$ is the total multiplicity of microstates available to an isolated system. $\Omega$ depends
on system parameters that are conserved by the dynamics, such as energy and the number
of particles. It is often sensible to divide the available microstates into groups, each
characterised by a common feature that is not dynamically conserved, and therefore
changes with time, and to refer to the groups as macrostates. The idea of a macroscopic
state in classical thermodynamics maps onto the concept of macrostates in statistical
thermodynamics. The multiplicity $\Omega_\alpha(N, E)$ of such a macrostate depends on the conserved
quantities, and on the value of a non-conserved quantity $\alpha$ that serves to define the
macrostate, such as the oscillator macrostate spikiness that we encountered in Section 4.3.

8.1.1 The Second Law of Thermodynamics 

The second law of thermodynamics emerges in statistical thermodynamics if we set up
an isolated system under an initial constraint that it should take only a certain range of
values of a property $\alpha$, and then remove this restriction on the dynamics. If the system
remains isolated, the parameters $N$ and $E$ do not change, but the range of available values
of $\alpha$ is typically increased, and after a relaxation period the probabilities that the system
might take values of $\alpha$ are assumed to be proportional to the microstate multiplicity $\Omega_\alpha$
of the associated macrostate.

In such discussions, the concept of a macrostate can be employed in two ways: in
a narrow and in a broad sense. We can state that the system is in a macrostate where
it takes a specific value of the non-conserved quantity $\alpha$, but the collection of all such
macrostates can also be viewed as a macrostate, this time labelled by the whole range
of available values of $\alpha$ and corresponding to the complete accessible phase space. So
in a thermodynamic process, we start off in one macrostate, labelled by the initial range
of values of $\alpha$. We release the constraint on $\alpha$, allowing the system to visit new (narrow
sense) macrostates that were inaccessible before. Eventually the system settles into a
new equilibrium, taking a final (broad sense) macrostate labelled by the entire range of
$\alpha$ made possible under the new dynamics.

This is illustrated in Figure 8.1, where we see a phase space divided into white
regions characterised by different values of $\alpha$. The initial macrostate might be one or
more of the white regions, and after a release of a constraint the final macrostate would
be the entire white sector. Beyond that are pink regions that are inaccessible under
the remaining dynamical constraints. As the microstate multiplicity of the final (broad
sense) macrostate is clearly larger than the microstate multiplicity of any initial (narrow
sense) macrostates, Boltzmann’s expression clearly explains the second law. This is
essentially the same argument we gave in Section 4.4.

As entropy in thermodynamics is traditionally regarded as an equilibrium property, 
Boltzmann’s expression should perhaps only apply before the constraint is removed, or 
some time in the future when the statistical properties of the system have become time 
independent. These situations are where the principle of equal a priori probabilities is 
supposed to apply, and so Boltzmann’s expression could also be taken to be a connection 
between entropy and the equilibrium probabilities of microstate occupation, which we 
shall explore later. 

Boltzmann proposed that the dynamics of particles in a gas, for example, were capable
of establishing equal, time independent microstate probabilities. An extreme case would
be that the dynamics should allow a system in equilibrium to visit every microstate for
an equal length of time. This is called the assumption of ergodicity: unfortunately, it
cannot be proved in general, and would appear to be a rather demanding requirement
for a finite measurement interval. So in Boltzmann’s approach, the fundamental axiom
is that the dynamics of an isolated system are ergodic, or something approaching it.

Figure 8.1 A white phase space that is accessible to a system, divided into macrostates
corresponding to values of a non-conserved quantity $\alpha$. The area of each patch is a
representation of its microstate multiplicity.

It is possible to extend these dynamical ideas to the nonequilibrium behaviour seen
just after the release of the dynamical constraint. We can define the microstate
multiplicity associated with each of the macrostates $\alpha$, and we could take the view that
$S_\alpha = k \ln \Omega_\alpha$ represents something similar to entropy as the system moves through the
macrostates during its relaxation towards equilibrium. As a system explores its phase
space, it passes from (narrow sense) macrostate to macrostate, and so the value of $S_\alpha$
changes. Although some directions of travel in phase space can conceivably lead to a
decrease in the value of $S_\alpha$, most lead to its increase, as long as the dynamics are of a
kind that explores phase space broadly and efficiently. Such a picture gives us a quantity
that tends to increase after the release of a constraint, and to reach a peak at a value
corresponding to the macrostate with the largest microstate multiplicity, thereafter taking
only rare excursions into neighbouring macrostates accompanied by decreases in $S_\alpha$.
This resembles the presumed developing behaviour of a nonequilibrium entropy.
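
This behaviour can be illustrated with a toy simulation. The sketch below is not taken from the text. It assumes a set of oscillators sharing quanta, labels macrostates by the population distribution introduced in Section 8.1.3 (multiplicity $N!/\prod_k n_k!$), and assumes a simple hopping rule in which a randomly chosen oscillator passes one quantum to another. The rule is symmetric between a move and its reverse, so it samples microstates uniformly in the long run. All names and parameter values are hypothetical.

```python
import math
import random
from collections import Counter

def log_multiplicity(quanta):
    """ln of N!/prod_k(n_k!) for the population macrostate of a microstate."""
    populations = Counter(quanta)          # n_k: number of oscillators holding k quanta
    log_omega = math.lgamma(len(quanta) + 1)
    for n_k in populations.values():
        log_omega -= math.lgamma(n_k + 1)
    return log_omega

def relax(n_osc=20, n_quanta=40, steps=2000, seed=1):
    """Random quantum-hopping dynamics starting with all quanta on one oscillator."""
    rng = random.Random(seed)
    quanta = [0] * n_osc
    quanta[0] = n_quanta                    # constrained initial macrostate
    history = [log_multiplicity(quanta)]
    for _ in range(steps):
        i, j = rng.sample(range(n_osc), 2)  # ordered pair of distinct oscillators
        if quanta[i] > 0:                   # move one quantum from i to j if possible
            quanta[i] -= 1
            quanta[j] += 1
        history.append(log_multiplicity(quanta))
    return quanta, history

final, history = relax()
print("initial S/k:", history[0])
print("mean S/k over last 500 steps:", sum(history[-500:]) / 500)
```

Plotting `history` against step number gives a curve of the kind described above: a rise after the constraint is released, followed by fluctuations near the largest values.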

Using a geographical analogy, it is like starting a journey from London that involves 
a complicated schedule of flights to destinations all over the world. During the journey, 
a traveller looks out of the window and sees water. The chances are that she will be 
over the Pacific Ocean, as it is the largest in area. Sometimes she might see the Atlantic 
or the Indian Oceans. The Caspian Sea is visited rarely. She might spot the Serpentine 
in London’s Hyde Park, and this might even be quite likely during the period just after 
take-off, but later on it would become extremely improbable. On the whole, she need 
not bother to check the sat-nav: the water she sees is most likely the Pacific Ocean. 

Such a view does not quite correspond to the second law since occasional decreases in
total entropy are not supposed to happen. But it is a very good approximation when the
macrostate labelled $\alpha^*$ with the largest value of $S_\alpha$ absolutely dominates the phase space,
such that $\Omega_{\alpha^*} \approx \sum_\alpha \Omega_\alpha = \Omega$. Then temporary decreases in $S_\alpha$ will hardly ever be seen.
Cases where there is such dominance provide us with a simple way to determine a new
equilibrium state after the release of a constraint. We consider all the new macrostates (in
the narrow sense) that are made available, and select the one with the largest microstate
multiplicity. As long as the chosen macroscopic description carves phase space into
patches that are hugely disparate in microstate multiplicity, corresponding in our flight
example to an atlas with every ocean, sea and lake that is not the Serpentine labelled
as one body of water, while the Serpentine itself is subdivided into smaller and smaller
patches right down to the area occupied by each duck, then the quantity $S_{\alpha^*}$ is an
excellent approximation to $S_B = k \ln \Omega$.

In real physical systems, there is likely to be strong dependence of $\Omega$ on the parameter
$\alpha$ as well as on $N$ and $E$, such that vast disparity in the microstate multiplicities of
different macrostates is actually quite natural. The dynamics therefore select the macrostate
with maximum multiplicity, or more colloquially (but with a slight misuse of
terminology) they select the maximum entropy macrostate. The isolated system evolves to
maximise its ‘entropy’ $S_\alpha$. This provides a rationale for the second law as a variational
principle. We now examine some examples.

8.1.2 The Maximum Entropy Macrostate of Oscillator Spikiness 

The phase space of the three oscillators with nine quanta studied in Section 4.5 has
55 available microstates divided into nine groups, or spikiness macrostates, each with
a specific multiplicity. We start with a constraint that holds the system in the $Sp = 9$
macrostate of the phase space in Figure 4.3 and then release it. The total multiplicity
$\Omega(N, Q)$ of available microstates changes instantaneously from 3 to 55, but our interest
is in the value of $\Omega_{Sp}(N, Q)$ as the system moves in phase space and $Sp$ changes.

What emerges is a time sequence of values sampled from the histogram shown
in Figure 4.4. We would recognise the $Sp = 5$ macrostate as the maximum entropy
macrostate, and identify $k \ln \Omega_{Sp=5}(N = 3, Q = 9) = k \ln 12$ as an estimator of the total
Boltzmann entropy $k \ln 55$. We would frequently see $k \ln \Omega_{Sp}(N, Q)$ fluctuate below this
value as $Sp$ changed with time, but for systems with larger $N$ and $Q$, departure from the
maximum entropy macrostate becomes much less frequent, as indicated in Figure 8.2.
In such a thermodynamic limit, the histogram for $Sp$ becomes sharp, as suggested by
Figure 5.4, and most of the statistical weight is concentrated in a narrow region around
the peak. Nevertheless, the estimate $Sp = 5$ of the mean spikiness $\langle Sp\rangle = 5.18$ in the
example is a rather good approximation; so the procedure can be instructive even for a
small system.

8.1.3 The Maximum Entropy Macrostate of Oscillator Populations 

Now for a more complicated example that will have some value to us later on. We consider
a set of $N$ oscillators and divide the system phase space into macrostates according
to a rather different kind of label: the set of numbers $\{n_k\}$ of oscillators that possess
$k$ quanta. It is a population distribution that satisfies the conditions $\sum_k n_k = N$ and
$\sum_k k n_k = Q$. For a system with $Q$ quanta, there will be $Q + 1$ oscillator populations:
the number $n_0$ of oscillators possessing no quanta, the number $n_1$ that hold one quantum
and so on, up to a population $n_Q$ that holds all $Q$ quanta. So for the case $Q = 4$, there
are five populations $n_0, \ldots, n_4$, some of which can be zero.

Figure 8.2 Time dependence of $S_\alpha = k \ln \Omega_\alpha$ as a system evolves from an initial macrostate into
a larger set of macrostates labelled $\alpha$, each with multiplicity $\Omega_\alpha$, as illustrated in the inset. In the
thermodynamic limit, fluctuations away from the maximum entropy macrostate $\alpha^*$ become rare
and $S_{\alpha^*}$ may be taken to be a good estimate of the final entropy $S_B = k \ln \Omega$, where $\Omega = \sum_\alpha \Omega_\alpha$.

There will be a certain number of manifestations of the population distribution,
and these can be taken to define the different macrostates of the system. For $N = 3$
and $Q = 4$, there are four macrostates of population distribution. In the notation
$(n_0, n_1, n_2, n_3, n_4)$, these are (2,0,0,0,1), (1,1,0,1,0), (1,0,2,0,0) and (0,2,1,0,0). The last
of these, for example, is the macrostate with two oscillators holding one quantum and
one holding two quanta, accounting for the four quanta in all. The division of the phase
space is illustrated in Figure 8.3.


Figure 8.3 Phase space of the $N = 3$, $Q = 4$ oscillator system divided into colour-coded
macrostates labelled by the populations $n_k$ of the oscillators that possess $k$ quanta. The microstate
at the top vertex is identified in the notation of Section 4.2.2 as $(q_1, q_2, q_3) = (0,0,4)$, which makes
it a member of the macrostate with two oscillators possessing zero quanta and one possessing four,
namely $(n_0, n_1, n_2, n_3, n_4) = (2,0,0,0,1)$.

We know that the microstate multiplicity of the entire phase space for parameters
$N = 3$ and $Q = 4$ is $(Q + N - 1)!/(Q!\,(N - 1)!) = 6!/(4!\,2!) = 15$. How might these
be grouped according to the labelling by population distribution? In other words, what
is the microstate multiplicity of macrostate $(n_0, \ldots, n_Q)$?

We start by working out how many ways there are to choose $n_0$ oscillators from the
set of $N$ oscillators in order to assign them zero quanta. We can choose any of the $N$
oscillators to begin with, then any of the $N - 1$ remaining and so on. But this is an
overcount since we might get the same bunch of oscillators in a different order. The
number of possible sequences of the $n_0$ choices is $n_0!$ and so the number of ways of
selecting the $n_0$ oscillators is $N(N - 1) \cdots (N - n_0 + 1)/n_0!$.

For example, from a set of three objects labelled A, B and C, there are three ways
to choose two of them, namely AB, AC and BC. The procedure of selection we just
described would generate the pair AB in two ways, so we correct for this by dividing
by two. The formula with $N = 3$ and $n_0 = 2$ gives $3 \times 2/2! = 3$ as required.

Similarly there are $(N - n_0)(N - n_0 - 1) \cdots (N - n_0 - n_1 + 1)/n_1!$ ways to choose
the $n_1$ oscillators from the remaining $N - n_0$, in order to assign them one quantum, and
so on. The final $n_Q$ oscillators are chosen in
$(N - n_0 - n_1 - \cdots - n_{Q-1}) \cdots (N - n_0 - n_1 - \cdots - n_{Q-1} - n_Q + 1)/n_Q! = (N - n_0 - n_1 - \cdots - n_{Q-1}) \cdots 1/n_Q!$
ways since $N = n_0 + \cdots + n_Q$.

Multiplying together these factors for all $Q + 1$ populations, we find the number of
ways in which the oscillators can be assigned a specified population distribution $\{n_k\}$ of
quanta is written

$$\Omega_{\{n_k\}}(N, Q) = \frac{N!}{n_0!\, n_1! \cdots n_Q!}, \tag{8.2}$$

and by definition this is the microstate multiplicity of the macrostate $\{n_k\} = (n_0, \ldots, n_Q)$.

As a check, let us work out the multiplicity of the four macrostates of the $N = 3$,
$Q = 4$ system. The (2,0,0,0,1) macrostate corresponds to one oscillator holding all four
quanta ($n_4 = 1$), while the other two have none ($n_0 = 2$) and there are no oscillators
with one, two or three quanta ($n_1 = n_2 = n_3 = 0$). According to (8.2) this population
macrostate has multiplicity $3!/(2!\,0!\,0!\,0!\,1!) = 3$. Note that we take $0!$ to be equal to unity.
The three microstates in the macrostate correspond to the vertices of the triangular phase
space, where each oscillator in turn takes possession of all four quanta.

By a similar reasoning, the (1,1,0,1,0), (1,0,2,0,0) and (0,2,1,0,0) population 
macrostates have multiplicities 6, 3 and 3, respectively. The sum of all the macrostate 
multiplicities, 15, is equal, as expected, to the total number of microstates in the phase 
space. We have successfully carved up the phase space into macrostates of population 
distribution, as illustrated in Figure 8.3. 
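
The same check can be carried out by brute force. The sketch below, with hypothetical function names, lists every microstate of the $N = 3$, $Q = 4$ system, groups them by population distribution, and compares the counts with the multiplicity formula $N!/(n_0!\,n_1!\cdots n_Q!)$.

```python
from collections import Counter
from itertools import product
from math import factorial

N, Q = 3, 4

# Every microstate: an ordered assignment (q_1, q_2, q_3) of quanta summing to Q.
microstates = [q for q in product(range(Q + 1), repeat=N) if sum(q) == Q]

def populations(microstate):
    """Population distribution (n_0, ..., n_Q) of a microstate."""
    counts = Counter(microstate)
    return tuple(counts.get(k, 0) for k in range(Q + 1))

# Multiplicity of each population macrostate by direct counting.
counted = Counter(populations(q) for q in microstates)

def multiplicity(pops):
    """Multiplicity formula: N! / (n_0! n_1! ... n_Q!)."""
    denom = 1
    for n_k in pops:
        denom *= factorial(n_k)
    return factorial(sum(pops)) // denom

for pops, count in sorted(counted.items(), reverse=True):
    print(pops, count, multiplicity(pops))
print("total microstates:", len(microstates))
```

Changing `N` and `Q` at the top allows other small systems to be explored, although the number of candidate assignments examined by `product` grows quickly.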

As discussed before, we can divide phase space according to any kind of macrostates
that we might choose. The macrostate specification by population distribution $\{n_k\}$ is
in principle distinct from the specification by spikiness $Sp$ employed in Section 4.3
(although in the example just considered, the pattern of the carving up is exactly the
same). Different macrostates may be chosen depending on what macroscopic property
we wish to study.

The largest multiplicity is that of the (1,1,0,1,0) macrostate, coloured yellow in
Figure 8.3, and this population distribution (one oscillator with three quanta, one with
one, and one with none), denoted $\{n_k^*\}$, is the most likely to be seen during the dynamics,
assuming ergodicity. If the system were set up in macrostate (2,0,0,0,1) (the blue
macrostate) and then allowed to evolve according to the dynamics, then much later the
system would be presumed most likely to be found in the yellow macrostate (1,1,0,1,0).

The estimate of the final equilibrium entropy $k \ln \Omega_{\{n_k^*\}}(N, Q) = k \ln 6$ is somewhat
short of the correct value $k \ln 15$, but for a system approaching the thermodynamic limit,
where the difference in multiplicity between different macrostates becomes vast,
identifying the maximum entropy macrostate gives a good approximation of the Boltzmann
entropy $S_B = k \ln \Omega(N, Q)$, and provides an excellent estimate of system properties
averaged over the complete phase space.
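
To get a feel for how the estimate behaves as the system grows, the sketch below compares the logarithm of the largest population macrostate multiplicity with the logarithm of the total multiplicity. It is only an illustration under assumed sizes ($Q = N$ in the loop, which is not a choice made in the text), and it enumerates population macrostates as integer partitions of $Q$ into at most $N$ parts using a hypothetical helper.

```python
import math
from collections import Counter

def partitions(total, max_parts, largest=None):
    """Yield partitions of `total` into at most `max_parts` positive parts (non-increasing)."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, max_parts - 1, first):
            yield (first,) + rest

def log_macrostate_multiplicity(parts, n_osc):
    """ln of N!/prod_k n_k!, where `parts` lists the nonzero quanta held by oscillators."""
    pops = Counter(parts)
    pops[0] = n_osc - len(parts)           # oscillators holding no quanta
    return math.lgamma(n_osc + 1) - sum(math.lgamma(n + 1) for n in pops.values())

def compare(n_osc, n_quanta):
    """Return (ln of largest macrostate multiplicity, ln of total multiplicity)."""
    best = max(log_macrostate_multiplicity(p, n_osc) for p in partitions(n_quanta, n_osc))
    total = math.lgamma(n_quanta + n_osc) - math.lgamma(n_quanta + 1) - math.lgamma(n_osc)
    return best, total

for size in (3, 10, 20, 40):
    best, total = compare(size, size)
    print(size, round(best, 3), round(total, 3), round(best / total, 3))
```

Calling `compare(3, 4)` reproduces the $\ln 6$ and $\ln 15$ of the worked example, and the last printed column shows the ratio of the two logarithms for each size.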

Let us investigate this thermodynamic limit and identify the population macrostate
$\{n_k\}$ that has the largest multiplicity $\Omega_{\{n_k\}}(N, Q)$. We start with (8.2) and write

$$\ln \Omega_{\{n_k\}}(N, Q) = \ln N! - \sum_k \ln n_k!, \tag{8.3}$$

and if we assume that all the $n_k$ are large, such that Stirling’s formula can be used, this
becomes

$$\ln \Omega_{\{n_k\}}(N, Q) \approx N \ln N - N - \sum_k n_k \ln n_k + \sum_k n_k = N \ln N - \sum_k n_k \ln n_k = -\sum_k n_k \ln\left(\frac{n_k}{N}\right), \tag{8.4}$$

where we have used $N = \sum_k n_k$.

Now we find the maximum entropy macrostate by maximising $\ln \Omega_{\{n_k\}}(N, Q)$, while
ensuring that the conditions $N = \sum_k n_k$ and $Q = \sum_k k n_k$ are met. We use Lagrange’s
method of undetermined multipliers and maximise

$$I = -\sum_k n_k \ln\left(\frac{n_k}{N}\right) - \lambda \sum_k n_k - \beta \sum_k k n_k, \tag{8.5}$$

over the populations $\{n_k\}$, where $\lambda$ and $\beta$ are constants. We assume that the populations
are large, allowing us to treat them as continuous variables. Taking a partial derivative
of $I$ with respect to $n_{k'}$ then gives

$$-\ln\left(\frac{n_{k'}}{N}\right) - 1 - \lambda - \beta k' = 0, \tag{8.6}$$

and so the population of oscillators holding $k$ quanta in the maximum entropy
macrostate is

$$n_k^* = N \exp(-1 - \lambda - \beta k). \tag{8.7}$$

We evaluate $\lambda$ through the condition $\sum_k n_k^* = N$ such that

$$n_k^* = \frac{N}{Z} \exp(-\beta k) \quad \text{where} \quad Z = \sum_k \exp(-\beta k). \tag{8.8}$$

The maximum entropy population macrostate is therefore $\{n_k^*\} = (n_0^*, n_0^* x, n_0^* x^2, \ldots)$
where $n_0^* = N/Z$, $x = \exp(-\beta)$ and $Z = \sum_{k=0}^{\infty} x^k = (1 - x)^{-1}$, as long
as $\beta > 0$ such that $x < 1$. Notice that we extend the upper limit of the sum to infinity
since we are approaching the thermodynamic limit.

Finally, the constant $\beta$ can be identified through the condition

$$Q = \sum_k k n_k^* = \frac{N}{Z} \sum_k k \exp(-\beta k) = -\frac{N}{Z}\frac{dZ}{d\beta} = -\frac{N}{Z}\frac{dZ}{dx}\frac{dx}{d\beta} = \frac{N}{Z}\frac{x}{(1 - x)^2} = N\frac{x}{(1 - x)}, \tag{8.9}$$

which leads to $\beta = \ln(1 + N/Q)$. Notice that the form of (8.8) resembles the canonical
distribution, and that the parameter $\beta(N, Q)$ found here is identical to the parameter $\beta(N, Q)$ of
a system of $N$ oscillators holding $Q$ quanta, derived in (5.12), and which we came to
regard as the dimensionless inverse temperature of such a system.

The reason for this correspondence is that the fraction of the $N$ oscillators that have
$k$ quanta in the thermodynamic limit, namely $n_k^*/N$ since the properties of the system
are then those of the maximum entropy population macrostate $\{n_k^*\}$, will correspond to
the probability that a single oscillator in the group should possess $k$ quanta. The latter
probability was shown to be $Z^{-1} \exp(-\beta k)$ in (5.13) using the canonical ensemble. The
populations of oscillators with $k$ quanta in the maximum entropy population macrostate
appear to be distributed canonically. Indeed an argument similar to this is often employed
as an alternative derivation of the canonical distribution, and it can be used in the context
of a collection of general systems, not just a collection of oscillators.
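
As a numerical illustration of this correspondence, the sketch below compares the maximum entropy populations $n_k^* = (N/Z)\,x^k$, with $x = \exp(-\beta)$ and $\beta = \ln(1 + N/Q)$, against the exact mean populations obtained by weighting all microstates of $N$ oscillators with $Q$ quanta equally. The counting argument used for the exact result (the remaining $N - 1$ oscillators share the remaining $Q - k$ quanta) is an assumption of this sketch, as are the function names and the sizes chosen.

```python
import math

def max_entropy_populations(n_osc, n_quanta, k_max):
    """n_k* = (N/Z) x^k with x = exp(-beta), beta = ln(1 + N/Q), Z = 1/(1 - x)."""
    beta = math.log(1.0 + n_osc / n_quanta)
    x = math.exp(-beta)
    z = 1.0 / (1.0 - x)
    return [n_osc / z * x**k for k in range(k_max + 1)]

def exact_mean_populations(n_osc, n_quanta, k_max):
    """Exact mean n_k for equally weighted microstates (requires k_max <= Q).

    The probability that one chosen oscillator holds k quanta is the number of ways
    the other N - 1 oscillators can share Q - k quanta, divided by the total multiplicity.
    """
    def omega(n, q):
        return math.comb(q + n - 1, q)
    total = omega(n_osc, n_quanta)
    return [n_osc * omega(n_osc - 1, n_quanta - k) / total for k in range(k_max + 1)]

N, Q = 200, 300
approx = max_entropy_populations(N, Q, 5)
exact = exact_mean_populations(N, Q, 5)
for k, (a, e) in enumerate(zip(approx, exact)):
    print(k, round(a, 3), round(e, 3))
```

The two columns can be compared for different `N` and `Q` to see how closely the geometric form tracks the exact populations at a given size.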

8.1.4 The Third Law of Thermodynamics 

The next use we have for Boltzmann’s entropy expression is to justify the third law 
of thermodynamics, namely that the entropy of a system should approach a universal 
constant (taken to be zero) as the temperature approaches zero. This limit corresponds 
to the removal of all energy from a system, if we ignore zero point energy, and this is 
the context in which we shall discuss the third law. 

A system with little energy to distribute amongst its various degrees of freedom has
a very reduced multiplicity of microstates. We have already argued that the multiplicity
is an increasing function of energy. The ultimate situation is that the smallest number
of available microstates of a system applies when the energy is zero, and this number
is unity if there is a unique ground state. The Boltzmann entropy of the system would
then be zero. If there are several configurations of the system with zero energy, then the
Boltzmann entropy at $E = 0$ would be nonzero, but physical systems do not commonly
have such degeneracy. The situation is illustrated by the example of $N$ oscillators with
$Q$ quanta, for which the microstate multiplicity is $\Omega(N, Q) = (Q + N - 1)!/(Q!\,(N - 1)!)$,
and this goes to unity if $Q = 0$. There is just one microstate with zero energy, the
one where every oscillator has zero quanta.


8.2 Gibbs Entropy

The Gibbs model of entropy differs from the Boltzmann approach in some important
ways. It is necessary to have another model in order to describe the entropy of a system
in contact with an environment. If energy transfers are possible, then a system
might be able to access an unlimited number of microstates, and the Boltzmann expression
would be inappropriate. The key feature of the Gibbs approach is that it connects
thermodynamic entropy directly to statistical ideas through the use of the equilibrium
microstate occupation probabilities. It turns out that this yields a finite entropy for an
open system.

The microstate probabilities for a system in equilibrium are established using the principle
of equal a priori probabilities, either for an isolated system or for the combination
of system with a reservoir, and in the latter case, we would expect to obtain canonical
system microstate probabilities, if the reservoir is large. The justification of the principle
could be argued on the grounds of the ergodic hypothesis, the fundamental basis according
to Boltzmann, but the principle could very well be regarded as the prime axiom
in itself. This would follow from an interpretation of probability from the viewpoint
of information theory: the microstate probabilities are not frequencies generated by the
dynamics, but are rather a best judgement of the statistical weight accorded to each
microstate. We might feel that no microstate of an isolated system should be weighted
differently from any other, when we lack any information by which we might distinguish
them. This automatically leads to the canonical and grand canonical equilibrium
microstate probabilities $P_i$ for a system in contact with different reservoirs, as we saw
in Chapters 5 and 7, and we need not worry about ergodic dynamics.

The Gibbs entropy $S_G$ of a system is defined by the important formula:

$$S_G = -k \sum_i P_i \ln P_i, \tag{8.10}$$

where $i$ labels the microstates of the system. Let us insert the canonical probabilities
$P_i = Z^{-1} \exp(-E_i/kT)$ for a system in contact with a heat bath at temperature $T$ to see
what this implies. We find that

$$S_G = -k \sum_i P_i \left(-\frac{E_i}{kT} - \ln Z\right) = \frac{\langle E\rangle}{T} + k \ln Z, \tag{8.11}$$


where the brackets indicate a canonical average. A clear indication that the Gibbs entropy
differs from the Boltzmann entropy is that $S_G$ is a function of the reservoir property $T$
and the number of particles $N$. It therefore naturally applies to a system in contact with a
heat bath. In contrast, $S_B$ is a function of conserved quantity $E$, and is designed to apply
to an isolated system. Of course, in the thermodynamic limit, when fluctuations about
mean properties become negligible, there is an essentially unique equilibrium system
energy associated with a given reservoir temperature, and the mathematical distinction
between functions $S_G(N, T)$ and $S_B(N, E)$ becomes less important.
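
The relation between the Gibbs entropy $S_G = -k\sum_i P_i \ln P_i$ and the canonical quantities $\langle E\rangle/T + k\ln Z$ can be verified for any finite set of energy levels. The sketch below uses a hypothetical four-level spectrum in units where $k = 1$; neither the levels nor the temperature come from the text.

```python
import math

def canonical_probabilities(energies, kT):
    """P_i = exp(-E_i/kT)/Z for a list of microstate energies; also returns Z."""
    weights = [math.exp(-e / kT) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z

def gibbs_entropy(probabilities):
    """S_G / k = -sum_i P_i ln P_i (terms with P_i = 0 contribute nothing)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

# Hypothetical four-level system; energies are in units where k = 1.
energies = [0.0, 1.0, 1.0, 2.5]
kT = 0.8
probs, z = canonical_probabilities(energies, kT)
mean_energy = sum(p * e for p, e in zip(probs, energies))

print("S_G/k from (8.10):        ", gibbs_entropy(probs))
print("<E>/kT + ln Z from (8.11):", mean_energy / kT + math.log(z))
```

The two printed numbers agree to rounding error for any choice of levels and temperature, because the identity is algebraic rather than approximate.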
The Gibbs expression is compatible with the maximum entropy macrostate estimate of
Boltzmann entropy derived in Section 8.1.3. According to a generalisation of (8.4)
we write

$$S_B = k \ln \Omega \approx k \ln \Omega_{\alpha^*} = -Nk \sum_i \frac{n_i^*}{N} \ln\left(\frac{n_i^*}{N}\right), \tag{8.12}$$

for a system consisting of $N$ similar subsystems (oscillators in that earlier discussion) each
able to assume one of a set of subsystem microstates now labelled by $i$. The number of
subsystems in microstate $i$ is $n_i^*$, for the maximum entropy population macrostate, and this
is found to take the canonical form (8.8). The probability $P_i$ that a particular subsystem
should assume microstate $i$ is $n_i^*/N$; so we can equate the Boltzmann entropy of a set of $N$
similar subsystems, in the thermodynamic limit, to be $N$ times the Gibbs entropy (defined
in (8.10)) of each one: namely, $S_B(N, E) = N S_G(1, T)$, with $E$ and $T$ related uniquely.
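
For the oscillator system this equality can be examined directly. In the sketch below, which assumes $Q = 2N$ purely for illustration, the Boltzmann entropy is evaluated from the multiplicity $(Q + N - 1)!/(Q!\,(N - 1)!)$ and compared with $N$ times the Gibbs entropy of one oscillator whose probabilities are $P_k = (1 - x)x^k$ with $x = Q/(N + Q)$. Function names are hypothetical.

```python
import math

def boltzmann_entropy(n_osc, n_quanta):
    """S_B / k = ln[(Q + N - 1)! / (Q! (N - 1)!)] via log-gamma."""
    return (math.lgamma(n_quanta + n_osc)
            - math.lgamma(n_quanta + 1)
            - math.lgamma(n_osc))

def single_oscillator_gibbs_entropy(quanta_per_oscillator):
    """S_G / k for one oscillator with P_k = (1 - x) x^k and x = Q / (N + Q)."""
    ratio = quanta_per_oscillator           # Q/N
    x = ratio / (1.0 + ratio)               # x = exp(-beta), beta = ln(1 + N/Q)
    # -sum_k P_k ln P_k evaluated in closed form for the geometric distribution
    return -math.log(1.0 - x) - x * math.log(x) / (1.0 - x)

for n_osc in (10, 1000, 100000):
    n_quanta = 2 * n_osc
    s_b = boltzmann_entropy(n_osc, n_quanta)
    s_g = n_osc * single_oscillator_gibbs_entropy(n_quanta / n_osc)
    print(n_osc, round(s_b, 2), round(s_g, 2), round(s_b / s_g, 4))
```

The last column shows the ratio of the two entropies, which moves towards one as the number of oscillators increases.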

Turning to more straightforward matters, note that (8.11) can be rewritten as

$$-kT \ln Z = \langle E\rangle - T S_G, \tag{8.13}$$

and so with reference to (3.13), and if we accept a correspondence between the thermodynamic
state variable $E$ and the statistical mean $\langle E\rangle$, we appear to be able to identify
$-kT \ln Z$ with the Helmholtz free energy:

$$F = -kT \ln Z, \tag{8.14}$$

just as we found in (6.16). This correspondence lends support to the form of entropy
proposed by Gibbs. Note that $Z$ in (8.14) is a function of $T$ and $N$, and that together
with the system volume, these are the natural variables for the Helmholtz free energy in
classical thermodynamics.

The Gibbs entropy in (8.10) is a sum over microstates, but we can group the microstates
into energy macrostates $\alpha$ with equilibrium probabilities $P_\alpha = \Omega_\alpha \exp(-\beta E_\alpha)/Z$ to
obtain the expression

$$S_G = -k \sum_\alpha P_\alpha \ln\left(\frac{P_\alpha}{\Omega_\alpha}\right), \tag{8.15}$$

where $\Omega_\alpha$ is the microstate multiplicity of the macrostate $\alpha$. Whether expressed in
terms of microstate or macrostate probabilities, the Gibbs entropy is independent of
the choice of macrostate patchwork, because like the Boltzmann entropy we wish it to
be a measure of the size of the whole available phase space.
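
The equivalence of the microstate and macrostate forms is easy to confirm numerically. The following sketch assumes a hypothetical degenerate spectrum (units with $k = 1$) and evaluates $-\sum_i P_i \ln P_i$ over microstates and $-\sum_\alpha P_\alpha \ln(P_\alpha/\Omega_\alpha)$ over energy macrostates.

```python
import math
from collections import defaultdict

def gibbs_from_microstates(probabilities):
    """S_G / k = -sum_i P_i ln P_i."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

def gibbs_from_macrostates(macro_probabilities, multiplicities):
    """S_G / k = -sum_a P_a ln(P_a / Omega_a), as in (8.15)."""
    return -sum(p * math.log(p / w)
                for p, w in zip(macro_probabilities, multiplicities) if p > 0.0)

# Hypothetical degenerate spectrum (units with k = 1): energy -> multiplicity.
levels = {0.0: 1, 1.0: 3, 2.0: 5}
kT = 1.3

energies = [e for e, g in levels.items() for _ in range(g)]
weights = [math.exp(-e / kT) for e in energies]
z = sum(weights)
micro_probs = [w / z for w in weights]

grouped = defaultdict(float)
for e, p in zip(energies, micro_probs):
    grouped[e] += p                        # P_a = Omega_a exp(-E_a/kT) / Z

macro_probs = [grouped[e] for e in levels]
multiplicities = [levels[e] for e in levels]

print(gibbs_from_microstates(micro_probs))
print(gibbs_from_macrostates(macro_probs, multiplicities))
```

The agreement relies on every microstate within a macrostate having the same probability, which holds here because the grouping is by energy.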

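If we employ the grand canonical microstate probabilities
$P_i^G = Z_G^{-1} \exp[-(E_i - \mu N_i)/kT]$ in the Gibbs expression for the entropy, we obtain

$$S_G = -k \sum_i P_i^G \left(-\frac{E_i}{kT} + \frac{\mu N_i}{kT} - \ln Z_G\right) = \frac{\langle E\rangle}{T} - \frac{\mu \langle N\rangle}{T} + k \ln Z_G, \tag{8.16}$$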


where the angled brackets now represent a grand canonical average. This can be rearranged
into

$$-kT \ln Z_G = \langle E\rangle - T S_G - \mu \langle N\rangle = \Phi, \tag{8.17}$$

where $\Phi(\mu, V, T)$ is the grand potential defined in (3.31), interpreting the system energy
and particle number in that classical thermodynamic expression as grand canonical
averages. This is another connection between a quantity from statistical thermodynamics
(the left hand side) and a quantity from classical thermodynamics (the right hand side),
similar to the connection established in (8.14).
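
A small numerical check of the grand potential identity may be helpful. The sketch below assumes a hypothetical system with three states of different energy and particle number (units with $k = 1$) and compares $-kT\ln Z_G$ with $\langle E\rangle - TS_G - \mu\langle N\rangle$.

```python
import math

def grand_canonical(states, mu, kT):
    """Return (probabilities, Z_G) for states given as (energy, particle_number) pairs."""
    weights = [math.exp(-(e - mu * n) / kT) for e, n in states]
    z_g = sum(weights)
    return [w / z_g for w in weights], z_g

# Hypothetical system (units with k = 1): a site that is empty, singly or doubly occupied.
states = [(0.0, 0), (1.0, 1), (2.5, 2)]
mu, kT = 0.7, 0.9

probs, z_g = grand_canonical(states, mu, kT)
mean_e = sum(p * e for p, (e, n) in zip(probs, states))
mean_n = sum(p * n for p, (e, n) in zip(probs, states))
entropy = -sum(p * math.log(p) for p in probs)      # S_G / k

lhs = -kT * math.log(z_g)                           # -kT ln Z_G
rhs = mean_e - kT * entropy - mu * mean_n           # <E> - T S_G - mu <N>
print(lhs, rhs)
```

As with the canonical case, the two sides coincide to rounding error whatever values of `mu` and `kT` are supplied.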
8.2.1 Fundamental Relation of Thermodynamics and Thermodynamic Work

If we accept the Gibbs expression for the entropy, then it allows us to derive familiar classical
thermodynamic relationships starting from the ideas of statistical thermodynamics.
Consider the change in mean system energy in the canonical ensemble brought about by a
change in the canonical probabilities, caused by an alteration to the reservoir temperature,
for example, together with a variation in a parameter $x$ that affects the microstate
energies $E_i$. From $\langle E\rangle = \sum_i E_i P_i$, where the $P_i$ are canonical probabilities, we write

$$d\langle E\rangle = \sum_i E_i\, dP_i + \sum_i P_i\, dE_i = -kT \sum_i (\ln P_i + \ln Z)\, dP_i + \sum_i P_i \frac{dE_i}{dx}\, dx = -kT \sum_i \ln P_i\, dP_i - f_x\, dx, \tag{8.18}$$


where the condition $\sum_i dP_i = 0$ has been employed in order to maintain normalisation
$\sum_i P_i = 1$, and where we define a thermodynamic force $f_x$, associated with $x$:

$$f_x = -\sum_i P_i \frac{dE_i}{dx}. \tag{8.19}$$


Next we note that

$$dS_G = -k \left[\sum_i \ln P_i\, dP_i + \sum_i P_i \left(\frac{1}{P_i}\right) dP_i\right] = -k \sum_i \ln P_i\, dP_i, \tag{8.20}$$

again having used $\sum_i dP_i = 0$. Combining this with (8.18), we get

$$d\langle E\rangle = T\, dS_G - f_x\, dx. \tag{8.21}$$

This resembles the fundamental relation of thermodynamics $dE = T\,dS - p\,dV$, if we
associate the classical state variable $E$ with the canonical average $\langle E\rangle$. The second term
on the right hand side of (8.21) should therefore correspond to the quasistatic work done
on the system, which we can define in statistical thermodynamics to be the contribution
to the change in mean system energy brought about by a change in a parameter that
affects microstate energies, on the assumption that the microstate probabilities remain
canonical in form throughout the process.

For example, if the system microstate energies depended on system volume $V$, then
$-f_V\, dV$ is just the change in mean system energy brought about by increasing the volume
by $dV$. The thermodynamic force $f_V$ would correspond to the system pressure $p$.

Notice that (8.19) implies that

$$f_V = -\frac{1}{Z} \sum_i \frac{dE_i}{dV} \exp\left(-\frac{E_i}{kT}\right) = \frac{kT}{Z} \frac{\partial}{\partial V} \sum_i \exp\left(-\frac{E_i(V)}{kT}\right) = kT \frac{\partial \ln Z}{\partial V}, \tag{8.22}$$

which is compatible with (3.35), namely $p = -(\partial F/\partial V)_T$, if we again employ
$F = -kT \ln Z$. This is a powerful result that enables us to calculate the pressure in
general systems. We shall check that it reproduces the pressure of an ideal classical gas
when we develop the statistical thermodynamics of such a system in Chapter 9, and we
shall employ it in Section 13.3.3 to calculate the pressure of electromagnetic radiation.
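
The two routes to the thermodynamic force can be compared for a simple model. The sketch below is an illustration only: it assumes a single particle in a one-dimensional box of length $L$, with levels $E_n = n^2/L^2$ in units where $h^2/8m = 1$ and $k = 1$, so that the 'volume' is a length and the 'pressure' is a force. It evaluates $-\sum_i P_i\, dE_i/dL$ directly and $kT\, \partial \ln Z/\partial L$ by finite difference. None of these choices is taken from the text.

```python
import math

def energies(length, n_levels=400):
    """Levels of a particle in a 1D box, E_n = n^2 / L^2 (units with h^2/8m = 1)."""
    return [n * n / length**2 for n in range(1, n_levels + 1)]

def log_partition(length, kT):
    """ln Z for the truncated set of box levels."""
    return math.log(sum(math.exp(-e / kT) for e in energies(length)))

def force_from_average(length, kT):
    """f_L = -sum_i P_i dE_i/dL, using dE_n/dL = -2 E_n / L."""
    levels = energies(length)
    weights = [math.exp(-e / kT) for e in levels]
    z = sum(weights)
    return sum(w / z * 2.0 * e / length for w, e in zip(weights, levels))

def force_from_partition(length, kT, h=1e-5):
    """f_L = kT d(ln Z)/dL by central difference, the analogue of (8.22)."""
    return kT * (log_partition(length + h, kT) - log_partition(length - h, kT)) / (2.0 * h)

L, kT = 2.0, 5.0
print(force_from_average(L, kT), force_from_partition(L, kT))
```

The truncation at 400 levels is harmless for these parameter values because the Boltzmann factors of the highest levels are negligible.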
8.2.2 Relationship to Boltzmann Entropy


The machinery of classical thermodynamics seems to emerge naturally if we base
statistical thermodynamics on the Gibbs entropy and on canonical probabilities, noting
that some of the classical state variables then correspond to canonical averages of
microscopic quantities. Nevertheless, the Gibbs and Boltzmann entropies are closely
related to each other in the thermodynamic limit. The way to demonstrate this is to start
with the Gibbs entropy expressed in terms of macrostate probabilities $P_\alpha$,

$$S_G = -k\sum_\alpha P_\alpha \ln\left(\frac{P_\alpha}{\Omega_\alpha}\right), \qquad (8.23)$$


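and to employ an expression for $P_\alpha$ that is appropriate as the thermodynamic
limit is approached. We label the macrostates by energy $E_\alpha$ and write
$S_G = -k\sum_\alpha P(E_\alpha)\ln\left(P(E_\alpha)/\Omega(E_\alpha)\right)$. Following (5.28), the $P(E_\alpha)$ may be taken to be
approximately Gaussian:

$$P(E_\alpha) \approx Z^{-1}\,\Omega(\langle E\rangle)\exp\left(-\frac{\langle E\rangle}{kT}\right)\exp\left(-\frac{(E_\alpha-\langle E\rangle)^2}{2\sigma_E^2}\right) = \frac{1}{\mathcal{N}}\exp\left(-\frac{(E_\alpha-\langle E\rangle)^2}{2\sigma_E^2}\right), \qquad (8.24)$$

where $\langle E\rangle$ is the canonical mean energy of the system,
$\sigma_E^2 = \langle (E_\alpha-\langle E\rangle)^2\rangle$ is the variance in system energy, and
$\mathcal{N} = \sum_\alpha \exp\left(-(E_\alpha-\langle E\rangle)^2/2\sigma_E^2\right)$ is the normalisation constant. Next, we expand the
logarithm of the multiplicity of the macrostates about the mean energy: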


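$$\ln\Omega(E_\alpha) \approx \ln\Omega(\langle E\rangle) + (E_\alpha-\langle E\rangle)\frac{\partial \ln\Omega(\langle E\rangle)}{\partial\langle E\rangle} = \ln\Omega(\langle E\rangle) + \frac{E_\alpha-\langle E\rangle}{kT}, \qquad (8.25)$$

making use of the equality between the $\beta$ parameter of the reservoir ($1/kT$) and that
of the most likely macrostate (for which $E_\alpha = \langle E\rangle$), as discussed in Section 5.4. Since the
term linear in $E_\alpha - \langle E\rangle$ averages to zero, this
implies that $\langle \ln\Omega(E_\alpha)\rangle \approx \ln\Omega(\langle E\rangle)$, which allows us to write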


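$$S_G = k\sum_\alpha P(E_\alpha)\ln\Omega(E_\alpha) - k\sum_\alpha P(E_\alpha)\ln P(E_\alpha) \approx k\ln\Omega(\langle E\rangle) - k\sum_\alpha P(E_\alpha)\ln P(E_\alpha). \qquad (8.26)$$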


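The final term may be written

$$-k\sum_\alpha P(E_\alpha)\ln P(E_\alpha) = k\,\frac{\langle (E_\alpha-\langle E\rangle)^2\rangle}{2\sigma_E^2} + k\ln\mathcal{N} = k\left(\frac{1}{2} + \ln\mathcal{N}\right). \qquad (8.27)$$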


We now recognise that as the thermodynamic limit is approached, very few macrostates
are actually visited, and in the limit, only the one with $E_\alpha = \langle E\rangle$. This implies that the
normalisation constant $\mathcal{N}$ is of the order of one, while $\Omega(\langle E\rangle)$, the microstate multiplicity
of the macrostate at the mean energy, is enormous. Thus from (8.26) and (8.27), we find
that the Gibbs entropy takes the form

$$S_G \approx k\ln\Omega(\langle E\rangle), \qquad (8.28)$$

for large systems, taking the Boltzmann form in terms of the multiplicity of the macrostate
at the mean energy appropriate to the prevailing reservoir temperature.
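
The approach of the Gibbs entropy to the Boltzmann form can be checked numerically. The
sketch below assumes a system of $N$ harmonic oscillators in contact with a reservoir, with
energies measured in units of one quantum, macrostates labelled by the total number of quanta $Q$,
and multiplicity $\Omega(Q) = (Q+N-1)!/(Q!\,(N-1)!)$. The function names, the choice $\beta = 1$ and
the truncation of the sum over $Q$ are illustrative assumptions.

```python
import math

def log_multiplicity(n_osc, quanta):
    """ln Omega(Q) for N oscillators sharing Q quanta, via log-gamma functions."""
    return math.lgamma(quanta + n_osc) - math.lgamma(quanta + 1) - math.lgamma(n_osc)

def gibbs_and_boltzmann(n_osc, beta=1.0, q_factor=20):
    """Return (S_G / k, ln Omega(<Q>)) for canonical macrostate probabilities."""
    q_max = q_factor * n_osc                      # assumed cut-off; the tail is negligible
    log_w = [log_multiplicity(n_osc, q) - beta * q for q in range(q_max + 1)]
    shift = max(log_w)                            # log-sum-exp to avoid overflow
    log_z = shift + math.log(sum(math.exp(lw - shift) for lw in log_w))
    s_gibbs = 0.0
    mean_q = 0.0
    for q, lw in enumerate(log_w):
        p = math.exp(lw - log_z)                  # P(E_alpha) for the macrostate with Q = q
        if p > 0.0:
            # -P ln(P / Omega), the summand of the macrostate form of the Gibbs entropy
            s_gibbs -= p * ((lw - log_z) - log_multiplicity(n_osc, q))
            mean_q += p * q
    s_boltzmann = log_multiplicity(n_osc, round(mean_q))
    return s_gibbs, s_boltzmann

for n in (10, 100, 1000):
    s_g, s_b = gibbs_and_boltzmann(n)
    print(n, s_g, s_b, s_g / s_b)
```

Under these assumptions the ratio of the two entropies should move towards unity as $N$ grows,
with a residual difference that is small compared with the entropy itself.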

8.2.3 Third Law Revisited

Its correspondence with the Boltzmann entropy in the thermodynamic limit suggests that 
the Gibbs entropy is compatible with the third law of thermodynamics, but this can be 
demonstrated more directly. 

The canonical probability of any microstate goes to zero as $T \to 0$ owing to the form
of the Boltzmann factor $\exp(-E_i/kT)$, unless the microstate energy is zero. Since the
limit of the quantity $x\ln x$ is zero as $x \to 0$, contributions to the sum $-k\sum_i P_i\ln P_i$ from
nonzero energy microstates vanish as $T \to 0$. Therefore, if there is a unique ground state
with zero energy, then as $T \to 0$ its probability of occupation approaches unity. Thus
$S_G \to -k(1)\ln(1) = 0$ as $T \to 0$. The third law is satisfied unless there are numerous
ground states of the system.
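
A short numerical illustration of this limit follows. The two small spectra, the temperatures
and the function name are assumptions made for the sketch; energies are measured from the
ground state and $kT$ is in the same arbitrary units.

```python
import math

def gibbs_entropy_over_k(energies, kT):
    """S_G / k = -sum_i P_i ln P_i for canonical probabilities at temperature kT."""
    ground = min(energies)
    weights = [math.exp(-(e - ground) / kT) for e in energies]
    z = sum(weights)
    entropy = 0.0
    for w in weights:
        p = w / z
        if p > 0.0:                 # x ln x tends to zero as x tends to zero
            entropy -= p * math.log(p)
    return entropy

unique_ground = [0.0, 1.0, 2.0, 3.0]        # assumed spectrum, single ground state
degenerate_ground = [0.0, 0.0, 1.0, 2.0]    # assumed spectrum, two ground states

for kT in (10.0, 1.0, 0.1, 0.01):
    print(kT, gibbs_entropy_over_k(unique_ground, kT),
          gibbs_entropy_over_k(degenerate_ground, kT))
```

With a unique ground state the entropy falls to zero, whereas with two ground states it
tends to $k\ln 2$ in this sketch, which is the sense of the caveat about numerous ground states.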

8.3 Shannon Entropy 

Shannon entropy is a further generalisation of Gibbs entropy for which various claims
are made, including applicability away from equilibrium. It is also known as information
entropy $S_I$ because of its links with information theory, a branch of mathematical logic
with its roots in the optimisation of transmission rates in communication systems, and
developed initially in 1948 by Claude Shannon (1916-2001).

The idea is that the combination of microstate probabilities used in the Gibbs entropy
provides a measure of the uncertainty embodied by a probability distribution, and that
the actual probabilities should be determined by maximising this uncertainty subject to
any known constraints. These ideas are based on the information theoretic interpretation
of probability as a best judgement of the likelihood of outcomes, rather than as a
representation of their frequencies of occurrence in trials, as alluded to in Section 4.1.

The starting point is the expression 

$$S_I = -k\sum_i P_i\ln P_i, \qquad (8.29)$$

in terms of a set of microstate probabilities $P_i$. This looks the same as the Gibbs entropy,
but the equilibrium probabilities $P_i$ are not this time assumed to be canonical or grand
canonical, on the basis of the principle of equal a priori microstate probabilities applied
to a microcanonical ensemble of system and environment. The Shannon expression is
claimed to be universal. The first piece of evidence in favour of such a viewpoint is
to notice that the form of the Shannon entropy is compatible not only with the Gibbs
entropy, but also with the Boltzmann entropy for an isolated system. If the appropriate
microstate probabilities $P_i = 1/\Omega$ are inserted, where $\Omega$ is the number of microstates,
we find that

$$S_I = -k\sum_i P_i\ln P_i = k\sum_i \Omega^{-1}\ln\Omega = k\ln\Omega = S_B. \qquad (8.30)$$

As just mentioned, it is axiomatic that the probabilities $P_i$ are determined by maximising
the Shannon entropy, taking due consideration of appropriate constraints. This has the
appealing feature that the equilibrium probabilities in the microcanonical, canonical and
grand canonical ensembles can be obtained using the same methods. The maximisation
may be carried out using the Lagrange multiplier approach employed in Section 8.1.2.
For example, in the microcanonical case, the only constraint is the normalisation
condition $\sum_i P_i = 1$, and the maximisation of $S_I - \lambda\sum_i P_i$ with respect to the $P_j$, where
$\lambda$ is a Lagrange multiplier, proceeds as

$$\frac{\partial}{\partial P_j}\left(S_I - \lambda\sum_i P_i\right) = -k\ln P_j - k - \lambda = 0, \qquad (8.31)$$

which implies that all the $P_j$ are equal. This is clearly the most uncertain we can be about
the microscopic state of a system, and the Shannon entropy would be $k\ln\Omega$ as we just
showed. In contrast, if we knew that a system was definitely in a particular microstate,
then the Shannon entropy would be zero (since either $P_i$ or $\ln P_i$ would be zero for all
microstates), which would then be a representation of the minimum of uncertainty.
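
These two limiting cases can be illustrated with a few lines of Python. The number of
microstates (55), the random trial distributions and the function names are assumptions
of the sketch; the entropy is expressed in units of $k$.

```python
import math
import random

def shannon_entropy_over_k(probs):
    """S_I / k = -sum_i P_i ln P_i, with the convention 0 ln 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def random_distribution(n_states, rng):
    """A normalised set of probabilities drawn at random (for comparison only)."""
    raw = [rng.random() for _ in range(n_states)]
    total = sum(raw)
    return [r / total for r in raw]

n_states = 55
uniform = [1.0 / n_states] * n_states
certain = [1.0] + [0.0] * (n_states - 1)

rng = random.Random(0)
best_random = max(shannon_entropy_over_k(random_distribution(n_states, rng))
                  for _ in range(1000))

entropy_uniform = shannon_entropy_over_k(uniform)
entropy_certain = shannon_entropy_over_k(certain)
print(entropy_uniform, math.log(n_states), best_random, entropy_certain)
```

None of the randomly chosen distributions should exceed the uniform value $\ln 55$, and the
distribution describing certainty about the microstate gives zero.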

If the maximisation of $S_I$ were carried out under the additional constraint of a known
mean energy $\langle E\rangle = \sum_i P_i E_i$, the resulting microstate probabilities would be determined
from

$$\frac{\partial}{\partial P_j}\left(S_I - \lambda\sum_i P_i - \lambda'\sum_i P_i E_i\right) = -k\ln P_j - k - \lambda - \lambda' E_j = 0, \qquad (8.32)$$

implying that the $P_i$ would be canonical, namely $P_i = Z^{-1}\exp(-\beta E_i)$ with
$Z = \sum_i \exp(-\beta E_i)$ and $\beta = \lambda'/k$. Notice that the parameter $\beta$ in this expression
derives from a Lagrange multiplier, and is to be identified through the expression
$\langle E\rangle = Z^{-1}\sum_i E_i\exp(-\beta E_i)$ that relates it to the known mean energy. In contrast, in
the usual derivation of the canonical ensemble, $\beta$ is a property of the reservoir. In
a similar way, by specifying a known mean particle number, we would be able to
generate the grand canonical microstate probabilities.
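
The identification of $\beta$ from a known mean energy can be sketched numerically. In the
fragment below the five-level spectrum, the target mean energy of 1.2, the bisection bounds and
all function names are assumptions for illustration. After solving for $\beta$, the canonical
probabilities are perturbed in a direction that preserves both normalisation and mean energy,
to check that the Shannon entropy (in units of $k$) falls.

```python
import math

def canonical(levels, beta):
    weights = [math.exp(-beta * e) for e in levels]
    z = sum(weights)
    return [w / z for w in weights]

def mean_energy(levels, probs):
    return sum(e * p for e, p in zip(levels, probs))

def entropy_over_k(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def solve_beta(levels, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on beta; the canonical mean energy decreases as beta increases."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(levels, canonical(levels, mid)) > target_mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

levels = [0.0, 1.0, 2.0, 3.0, 4.0]     # assumed microstate energies
target = 1.2                            # assumed known mean energy
beta = solve_beta(levels, target)
p_canonical = canonical(levels, beta)

# The direction (1, -2, 1, 0, 0) changes neither sum_i P_i nor sum_i P_i E_i here.
direction = [1.0, -2.0, 1.0, 0.0, 0.0]
eta = 0.01
p_plus = [p + eta * d for p, d in zip(p_canonical, direction)]
p_minus = [p - eta * d for p, d in zip(p_canonical, direction)]

print(beta, entropy_over_k(p_canonical), entropy_over_k(p_plus), entropy_over_k(p_minus))
```

Both perturbed distributions satisfy the same constraints but should carry less entropy than the
canonical one, consistent with the canonical form being the constrained maximum.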

This differs from Boltzmann’s viewpoint, which emphasised that the appropriate 
probabilities should arise through an analysis of the microscopic dynamics. It also goes 
beyond the principle of equal a priori probabilities, the basis of Gibbs’ development, by 
providing an underlying rationale for that principle. By maximising the uncertainty in 
the statistical description, as embodied by the Shannon entropy, while constraining the 
description to be consistent with whatever information we possess, we specify the least 
biassed probability distribution, or the logical best guess. On the other hand, Shannon 
entropy does require us to accept a particular view of probabilities, and opinions about 
this can differ. 

Yet another definition of entropy, similar to Shannon entropy, and named after von
Neumann, has been used to explore uncertainty in quantum mechanics, where the
probabilities of state occupation are affected by intrinsic quantum uncertainties, as well as
by classical statistical uncertainty. This is an important tool in the fields of quantum
information processing and quantum computers.


8.4 Fine and Coarse Grained Entropy 

Boltzmann’s insight that entropy is a measure of the number of accessible microstates 
raises a rather fundamental question. How do we define a microstate? So far, we have
discussed simple systems such as sets of oscillators where there is no doubt that a
microstate is defined by an arrangement of quanta, but we intend to apply the same 
ideas to physical systems, notably ones involving atoms and molecules. Is it enough 
to specify positions and velocities at the level of atoms, or should we dig deeper and 
specify the coordinates of the nuclei and electrons, or even of the protons and neutrons 
inside the nucleus? Surely this affects the value of the entropy? 

Of course, the absolute value of entropy does not matter if all we need is an entropy
difference. For example, imagine that each of the 55 microstates in the phase space of
$N = 3$ oscillators with $Q = 9$ quanta, illustrated in Figure 4.3, is actually a collection
of $\Omega$ underlying microstates. Is the Boltzmann entropy equal to $k\ln 55$ or $k\ln(55\,\Omega)$?
To a certain extent, it does not matter. Evaluating the difference in entropy on releasing
a constraint to move from the spikiness $Sp = 9$ macrostate into the entire phase space
gives an entropy change $\Delta S = k\ln(55/3)$ and the dependence on $\Omega$ has vanished. This
assumes that the underlying multiplicity of each (apparent) microstate is the same. But
if we did want to work with the absolute entropy, we simply do not know how many
layers might lie beneath the subatomic scale. So we have to satisfy ourselves with an
entropy that is defined with respect to some graining of the structure of matter.
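
The cancellation of the hidden multiplicity can be confirmed by direct enumeration. The sketch
below lists the microstates of three distinguishable oscillators sharing nine quanta; the
function names and the trial values of the hidden multiplicity are assumptions, and entropies
are in units of $k$.

```python
import math
from itertools import product

def microstates(n_osc, quanta):
    """All assignments (q_1, ..., q_N) of quanta to distinguishable oscillators."""
    return [q for q in product(range(quanta + 1), repeat=n_osc) if sum(q) == quanta]

def spikiness(state):
    """Difference in quanta between the highest and lowest occupied oscillators."""
    return max(state) - min(state)

states = microstates(3, 9)
spiky = [s for s in states if spikiness(s) == 9]

def entropy_change_over_k(hidden_multiplicity):
    """Entropy change on releasing the Sp = 9 constraint, each microstate being a
    collection of `hidden_multiplicity` underlying microstates."""
    initial = math.log(len(spiky) * hidden_multiplicity)
    final = math.log(len(states) * hidden_multiplicity)
    return final - initial

print(len(states), len(spiky), entropy_change_over_k(1), entropy_change_over_k(10 ** 6))
```

The enumeration gives 55 microstates in total and 3 with $Sp = 9$, and the computed entropy
change is $\ln(55/3)$ whatever hidden multiplicity is assumed.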

A similar issue arises in the following circumstances. If a gas is discovered to have two 
isotopes, should this invalidate any previously published tables of entropy measurements? 
After all, from that moment on, when we perform an experiment on the gas we are aware 
of additional uncertainty with regard to the mass of every atom. But if the isotopic 
composition has no measurable effect on the thermodynamic processes of interest to us, 
then the answer is no: the additional microscopic degree of freedom is irrelevant and 
need not be taken into account in calculating the entropy. 

The next matter that comes to mind is whether entropy depends on the choice by which 
we divide phase space into macrostates. This is apparent in the procedure where we take 
the entropy of the maximum entropy macrostate as an approximation to the total entropy. 
If we choose a finer scale macrostate patchwork across the phase space, does this not 
imply that the entropy is smaller than that obtained from a coarser scale patchwork? The 
answer is no: we recall that the entropy of the maximum entropy macrostate is only an 
approximation to the total entropy. The size of the accessible phase space is independent 
of the way we carve it up into macrostates, and this is the correct measure of the entropy. 

The practice of using larger patches of phase space to provide a coarser scale description
of the system configuration is known as coarse graining. While a probability
distribution across such coarse macrostates would appear to embody less uncertainty,
when properly interpreted through (8.23), for example, it nevertheless provides a measure
of the number of underlying microstates. Much confusion can arise from failing to
appreciate these points.
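
A small enumeration makes the point concrete. In this sketch the 55 equally likely microstates
of three oscillators with nine quanta are grouped in two ways: finely, by spikiness, and
coarsely, into "spiky" ($Sp \geq 5$) and "not spiky". The coarse grouping, the function names and
the use of entropy in units of $k$ are assumptions for illustration.

```python
import math
from collections import Counter
from itertools import product

def microstates(n_osc, quanta):
    return [q for q in product(range(quanta + 1), repeat=n_osc) if sum(q) == quanta]

def spikiness(state):
    return max(state) - min(state)

def macrostate_entropy_over_k(states, label):
    """-sum_a P_a ln(P_a / Omega_a) for equally likely microstates grouped by label."""
    total = len(states)
    multiplicity = Counter(label(s) for s in states)
    return -sum((om / total) * math.log((om / total) / om) for om in multiplicity.values())

def label_entropy_over_k(states, label):
    """-sum_a P_a ln P_a: the uncertainty in the macrostate label alone."""
    total = len(states)
    multiplicity = Counter(label(s) for s in states)
    return -sum((om / total) * math.log(om / total) for om in multiplicity.values())

def fine(state):
    return spikiness(state)            # nine spikiness macrostates

def coarse(state):
    return spikiness(state) >= 5       # two macrostates (assumed coarse graining)

states = microstates(3, 9)
total_entropy = math.log(len(states))
print(macrostate_entropy_over_k(states, fine), macrostate_entropy_over_k(states, coarse),
      total_entropy)
print(label_entropy_over_k(states, fine), label_entropy_over_k(states, coarse))
```

The uncertainty in the label alone shrinks under coarse graining, but once the multiplicity of
each macrostate is included, both groupings return the same $\ln 55$.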

8.5 Entropy at the Nanoscale 

Statistical thermodynamics can be applied to equilibrium systems of any size, as long 
as we are prepared to impose the principle of equal a priori probabilities, and this 
implies that entropy is a property of systems large and small. This is a departure from 
its classical roots in macroscopic thermodynamic behaviour, but a perfectly acceptable
extension. In recent years, the feasibility of performing well-controlled heat and mass
transfer experiments on systems at the nanoscale (which often means at submicron length
scales) has created a requirement for a formalism to match. The increase in the Gibbs or
Shannon entropy upon equilibration of a system still holds, but the Boltzmann concept
of a time-dependent entropy estimate $k\ln\Omega$ that tends to rise but sometimes falls is
also useful. This is a matter for current development.


8.6 Disorder and Uncertainty 

Finally, let us revisit the question raised in Chapter 1: is entropy a measure of disorder 
or of uncertainty? The development in this chapter so far would seem to support the 
latter: in fact disorder has not been mentioned at all. But the idea that entropy represents 
disorder is nevertheless extremely well embedded in the literature, and is how most 
introductory discussions are framed. 

I have personally found the emphasis on disorder confusing in that it raises unnecessary
questions. For example, the simple point can be made that the evolution of the
universe does not always involve a transition from order into disorder. Galaxies and
planets form from the cosmic chaos, and intelligent life itself has developed from the
chemical soup here on Earth. This sounds like order emerging from disorder, and the
suggestion is sometimes even made that life, with its ability to adapt to its environment,
and intelligence, with its appreciation of order and desire to impose it on its surroundings,
somehow does not obey the second law. And then, just what is meant by order?
Is it spatial repetition? Why is this special? Nature simply evolves one arrangement of
the world into another. A snapshot of a system is just a configuration of its atoms and
any perceived order within it can be simply a matter of taste.

Furthermore, it confuses me to suggest that energy exists in forms that are ordered or 
disordered and that the imperative is to evolve from one to the other. The concept of the 
reduction in the ‘quality’, or concentration of energy, alluded to in Section 2.16, does 
help since it is a more abstract formulation, but applying the concept of order to energy, 
or to the particles that possess it, for me carries too great an implication of regularity in 
spatial pattern. We need to unpick such arguments with care. 

We could perhaps rephrase the explanation by noting that systems of particles often 
have a tendency towards disorderly or chaotic evolution, in certain circumstances. The 
word disorderly suggests uncertainty in future behaviour while disordered suggests a 
lack of spatial pattern in the present: a subtle distinction. A disorderly crowd is less easy 
to describe, and less predictable, than an orderly one. 

On the other hand, the idea of disorder does capture something of Boltzmann’s insight 
that entropy represents uncertainty in microscopic configuration. Declaring that a set of 
particles is spatially ordered means we impose demanding geometrical criteria that few 
configurations can satisfy, and relaxing the requirement of order allows many more to 
meet the specification: ordered systems therefore come in fewer possible arrangements 
than disordered ones, and the connection with uncertainty follows. The mistake is to 
claim that low uncertainty is only ever associated with spatial order: we could have 
considered a demanding set of criteria that had nothing to do with spatial order: that
the particles should lie within a short distance of some randomly selected points, for
example. We are then referring to certainty, not order.

The growth of disorder is a useful shorthand for the increase in entropy, but it is not 
the whole story. Nevertheless, it does convey the idea of decay and decline in natural 
processes, and of dissipation of energy and matter, the ultimate destination being the 
‘heat death’ of the universe, when all distinctions have been lost leaving uniformity 
everywhere. It is a good story, though clearly there are exciting events that take place 
alongside the decline. And we should not spend time wondering whether we should 
regard the heat death universe as ordered (in that there is uniformity) or disordered (it 
must have lots of entropy). Its microscopic state is simply maximally uncertain. 


Exercises 

8.1 Consider two ideal gases, one coloured red and one coloured green. A container is 
divided in half by a partition that is permeable to red gas particles only. Initially, 
the right hand subvolume of the container holds red gas at a certain pressure, while 
the left hand subvolume holds red and green gas in equal proportions such that the 
total pressure on each side of the partition is the same. The container is in thermal 
contact with a heat bath at temperature T. Determine the chemical potential of red 
gas in the two subvolumes and deduce whether the partition, if it is free to move, is 
likely to slide to the left, to the right, or remain in place. A colour-blind professor 
observes the system and believes a law of thermodynamics is being violated: which 
one and why? 

8.2 Show that the microstate multiplicity of a macrostate of $N$ quantum harmonic oscillators,
labelled by the set of populations $\{n_k\} = \{n_0, n_1, \cdots, n_Q\}$, where $n_k$ is the
number of oscillators that possess $k$ quanta, is given by

$$\Omega(N, Q, \{n_k\}) = \frac{N!}{n_0!\,n_1!\cdots n_Q!},$$

such that $\sum_{k=0}^{Q} n_k = N$ and $\sum_{k=0}^{Q} k\,n_k = Q$, where $Q$ is the fixed total number of
quanta in the system. For the case $N = 3$ and $Q = 4$, identify the four macrostates
and their microstate multiplicities. A measurement is made of the ‘spikiness’ of
the oscillator system, defined as the difference in number of quanta possessed by
the highest occupied and lowest occupied oscillator at a given instant of time.
Show that the four population macrostates labelled by the $\{n_k\}$ are each characterised
by a unique spikiness value, and list those values. Determine the probability
distribution of the system over the macrostates, assuming that the statistics are governed
by the principle of equal a priori probabilities. For a system where $N$ and
the $n_k$ are very large, show that $\ln\Omega \approx -\sum_k n_k\ln(n_k/N)$. Hence show that the
maximum entropy macrostate of this system is characterised by the populations
$n_k^* = N\exp(-k\beta)/\sum_{m=0}^{Q}\exp(-m\beta)$, where $\beta$ is a constant.

8.3 Derive the grand canonical microstate probabilities by a constrained maximisation 
of the Shannon entropy. 



9 Statistical Thermodynamics of the Classical Ideal Gas


In this chapter, we develop the statistical thermodynamics of a system of noninteracting 
particles in order to model the classical properties of ideal gases at high temperatures or 
low densities. Some of the strangest phenomena in science are to be found when we cool 
a gas towards absolute zero, or increase its density far enough until deviations from the 
classical gas laws emerge. Such systems are known as quantum gases, and will be treated 
in Chapters 10-12. Nevertheless, our treatment of a classical gas is based on a quantum 
mechanical treatment of the particles, where we establish the system microstates and 
their energies, and then construct canonical averages through the partition function. Our 
aim will be to recover the entropy function describing the monatomic classical ideal 
gas, which was the focus of attention in Chapter 2. In order to do so, we shall find it 
necessary to employ another ingredient of quantum mechanics: the indistinguishability 
of particles and the requirement that the eigenstates of a system of many particles should 
satisfy certain symmetry requirements. 

A system of noninteracting particles confined by a common external potential, such as 
a box, is relatively straightforward to analyse quantum mechanically. We can construct 
the description from the solution to the problem of a single particle in the system. The 
eigenstates of the particle are known as ‘single particle states’. The procedure is similar 
to determining the shells available to an electron orbiting a nucleus, and then filling them 
with as many electrons as are available, subject to various rules. 


9.1 Quantum Mechanics of a Particle in a Box 

The problem of $N$ noninteracting particles in a cubic box of volume $V$ is reducible to
one of a single particle because the Hamiltonian operator in the Schrödinger equation
separates into kinetic energy terms for each particle. This means that the wavefunction
of the $N$ particles is a product of wavefunctions of individual particles, or indeed a sum
of such products, as we shall see. Therefore, we focus our attention on determining the
wavefunctions and energies of a single particle in the box. We start with

$$-\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) = \epsilon\,\psi(\mathbf{r}), \qquad (9.1)$$

where $m$ is the particle mass and $\psi(\mathbf{r})$ is the single particle wavefunction. Writing
$\psi(\mathbf{r}) = \psi_{n_x}(x)\,\psi_{n_y}(y)\,\psi_{n_z}(z)$, this separates further into equations for 1-d wavefunctions
of the type

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_{n_x}(x)}{dx^2} = \epsilon_{n_x}\psi_{n_x}(x), \qquad (9.2)$$

where the wavefunctions and energies are labelled with the index $n_x$, and where
$\epsilon = \epsilon_{n_x} + \epsilon_{n_y} + \epsilon_{n_z}$. The boundary conditions are $\psi_{n_x}(x) = 0$ at $x = 0$ and $x = L$, where $L$ is
the length of a side of the cubic box. The solutions are

$$\psi_{n_x}(x) \propto \sin(k_x x), \qquad (9.3)$$

in terms of an x-component of the wavevector $k_x = n_x\pi/L$, with $n_x$ a positive integer.
The associated energy is

$$\epsilon_{n_x} = \frac{\hbar^2 k_x^2}{2m} = \frac{h^2 n_x^2}{8mL^2}. \qquad (9.4)$$

Notice that these are not evenly spaced, in contrast to the quantised energies of a 1-d
harmonic oscillator.
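
To give a feel for the uneven spacing, the following sketch evaluates the 1-d levels
$h^2 n_x^2/8mL^2$ for a helium-4 atom in a box of side one nanometre. The choice of particle, its
approximate mass, the box size and the names used are assumptions made for the example.

```python
PLANCK_H = 6.62607015e-34     # Planck constant in J s
HELIUM_MASS = 6.646e-27       # kg, approximate mass of a helium-4 atom (assumed example)
BOX_LENGTH = 1.0e-9           # m, assumed side of the box

def level_energy(n, mass=HELIUM_MASS, length=BOX_LENGTH):
    """Energy h^2 n^2 / (8 m L^2) of the n-th 1-d standing wave, in joules."""
    return PLANCK_H ** 2 * n ** 2 / (8.0 * mass * length ** 2)

levels = [level_energy(n) for n in range(1, 7)]
spacings = [upper - lower for lower, upper in zip(levels, levels[1:])]

for n, energy in enumerate(levels, start=1):
    print(n, energy)
print(spacings)
```

The gap between adjacent levels is $(2n_x + 1)$ times the lowest energy, so it grows linearly with
$n_x$, whereas the gaps of a harmonic oscillator are all equal.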


9.2 Densities of States 


The single particle states may be visualised as a set of points arranged as a 3-d cubic
lattice, each specified by a wavevector $\mathbf{k} = (k_x, k_y, k_z)$ with magnitude $k$. The points lie
in what is called k-space, or sometimes reciprocal space, since in a manner of speaking it
is an inverse of the space corresponding to particle position $\mathbf{r}$, in the sense that functions
of $\mathbf{r}$ can be represented as Fourier transforms of functions of $\mathbf{k}$, and vice versa. A point
in this k-space represents a standing wave in r-space.

From the quantisation conditions, the nearest neighbour distance between k-space
lattice points is $\Delta k = \pi/L$. This is illustrated in Figure 9.1. The energy of each state is

$$\epsilon_{\mathbf{k}} = \frac{h^2}{8mL^2}\left(n_x^2 + n_y^2 + n_z^2\right), \qquad (9.5)$$


and this is proportional to the square of the distance from the origin to the lattice point
$\mathbf{k}$ representing the state. All states lying at the same distance from the origin in k-space
have the same energy. Each state has a degeneracy of $(2s + 1)$ for particles of
spin $s$, corresponding to different values of the quantum number $m_s$. Thus, the points
in Figure 9.1 represent the phase space available to a particle with a specific value of
$m_s$ and the whole picture would need to be duplicated $(2s + 1)$ times to include all
orientations of the spin.

Figure 9.1 Wavefunctions describing a single particle confined to a cubic volume $V = L^3$ take
the form of standing waves specified by wavevectors that form a cubic array in what is called
k-space, with allowed components $k_j = n_j\Delta k$ defined by positive integers $n_j$, some of which
are shown. We divide this phase space of a single particle into macrostates with a magnitude of
wavevector in the range $k \to k + dk$ and estimate their microstate multiplicity by counting the
number of single particle microstates within the octant shell indicated.


The canonical partition function is a sum of Boltzmann factors over all available
microstates:

$$Z_1 = \sum_{n_x,n_y,n_z}(2s+1)\exp(-\beta\epsilon_{\mathbf{k}}), \qquad (9.6)$$

where the spin degeneracy factor is seen to play the role of a multiplicity of microstates
for the macrostate labelled by the wavevector $\mathbf{k}$. It is more convenient, however, to group
the microstates into macrostates labelled by a range of the magnitude of the wavevector.
We write

$$Z_1 = \int_0^\infty \rho(k)\exp(-\beta\epsilon(k))\,dk, \qquad (9.7)$$

where $\rho(k)\,dk$ is the multiplicity of microstates in the range $k \to k + dk$. The function $\rho(k)$ is called
the density of states in wavevector.

We calculate $\rho(k)$ by counting the number of wavevector lattice points lying between
spherical shells of radius $k$ and $k + dk$ in the sector of k-space with positive wavevector
components. The volume of this region, multiplied by the spin degeneracy, and divided
by the volume per lattice point $(\Delta k)^3$, gives the microstate multiplicity:

$$\rho(k)\,dk = (2s+1)\,\frac{4\pi k^2\,dk}{8}\,\frac{1}{(\Delta k)^3} = \frac{(2s+1)k^2L^3}{2\pi^2}\,dk, \qquad (9.8)$$

and so we write

$$\rho(k) = \frac{(2s+1)Vk^2}{2\pi^2}, \qquad (9.9)$$


since $L^3$ is equal to the volume of the box $V$.

This approach can be used for counting states that take the form of waves in boxes. It
is usual to convert $\rho(k)$ into a density of states with respect to the energy $\epsilon$ of the wave.
For the present problem, where the wave describes a nonrelativistic quantum mechanical
particle in a box, the connection between energy and wavevector, or dispersion relation, is
$\epsilon = \hbar^2k^2/2m$. We define the multiplicity of microstates in the energy range $\epsilon \to \epsilon + d\epsilon$
to be $g(\epsilon)\,d\epsilon$, where $g(\epsilon)$ is the density of states in energy. Since the multiplicity in this
range is the same as that found over the corresponding range of $k$, we have

$$g(\epsilon)\,d\epsilon = \rho(k)\,dk, \qquad (9.10)$$

in which case

$$g(\epsilon) = \frac{(2s+1)Vk^2}{2\pi^2}\frac{dk}{d\epsilon} = \frac{(2s+1)Vk^2}{2\pi^2}\frac{m}{\hbar^2k} = \frac{(2s+1)mV}{2\pi^2\hbar^2}\left(\frac{2m\epsilon}{\hbar^2}\right)^{1/2} = \frac{(2s+1)V}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\epsilon^{1/2}. \qquad (9.11)$$

This density is illustrated in Figure 9.2. Its nonlinearity is a consequence of the uneven
spacing of the single particle energy levels alluded to in Section 9.1.
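
The counting argument behind the density of states can be tested by brute force. In the sketch
below, wavevectors are measured in units of the lattice spacing $\pi/L$ and the spin factor is
omitted ($s = 0$), so that the number of states with wavevector magnitude up to $R$ lattice
spacings should approach the integral of the density of states in wavevector, namely $\pi R^3/6$. The function names and the trial radii are
assumptions for illustration.

```python
import math

def count_states(radius):
    """Count triples of positive integers with n_x^2 + n_y^2 + n_z^2 <= radius^2."""
    r2 = radius * radius
    count = 0
    for nx in range(1, radius + 1):
        for ny in range(1, radius + 1):
            rest = r2 - nx * nx - ny * ny
            if rest < 1:
                break                      # no positive n_z fits for this or any larger n_y
            count += math.isqrt(rest)      # n_z runs over 1 .. floor(sqrt(rest))
    return count

def integrated_density(radius):
    """One octant of a sphere of the given radius, in units of the lattice cell volume."""
    return math.pi * radius ** 3 / 6.0

ratio_small = count_states(50) / integrated_density(50)
ratio_large = count_states(200) / integrated_density(200)
print(ratio_small, ratio_large)
```

The exact count falls slightly below the continuum estimate because points lying on the
coordinate planes are excluded; the relative shortfall should shrink roughly in inverse proportion to $R$,
which is why the continuum density of states is adequate when very many states are occupied.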


9.3 Partition Function of a One-Particle Gas 

The partition function representing the one-particle system is now given by

$$Z_1 = \int_0^\infty g(\epsilon)\exp(-\beta\epsilon)\,d\epsilon = \frac{(2s+1)V}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty \epsilon^{1/2}\exp(-\beta\epsilon)\,d\epsilon. \qquad (9.12)$$

Figure 9.2 The density of states $g(\epsilon)$, with respect to energy $\epsilon$, of a single particle in a box. The
magnification of the energy axis indicates explicitly that the spectrum of states steadily becomes
denser as $\epsilon$ increases.

The integral may be evaluated by the transformation $\beta\epsilon = z^2$ and employing
$\int_0^\infty z^2\exp(-z^2)\,dz = \pi^{1/2}/4$, giving

$$Z_1 = \frac{(2s+1)V}{\lambda_{\rm th}^3}, \qquad (9.13)$$

where we define the thermal de Broglie wavelength as

$$\lambda_{\rm th} = \left(\frac{h^2\beta}{2\pi m}\right)^{1/2} = \left(\frac{h^2}{2\pi mkT}\right)^{1/2}. \qquad (9.14)$$

We shall show that this is the same quantity as was introduced in (2.52).
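
The replacement of the sum over states by an integral can be examined directly. The sketch
below works with the dimensionless ratio $a = h^2/(8mL^2kT)$, an assumed shorthand, for which the
sum over states factorises into three identical 1-d sums, and for which the volume divided by
the cube of the thermal de Broglie wavelength $(h^2/2\pi mkT)^{1/2}$ equals $(\pi/4a)^{3/2}$. Function
names and the trial values of $a$ are illustrative assumptions.

```python
import math

def partition_sum(a, s=0):
    """(2s+1) times the cube of sum_n exp(-a n^2), with a = h^2 / (8 m L^2 k T)."""
    n_max = int(math.sqrt(50.0 / a)) + 1       # terms beyond this are below exp(-50)
    one_d = sum(math.exp(-a * n * n) for n in range(1, n_max + 1))
    return (2 * s + 1) * one_d ** 3

def partition_integral(a, s=0):
    """(2s+1) V / lambda^3, written in terms of a as (pi / 4a)^(3/2)."""
    return (2 * s + 1) * (math.pi / (4.0 * a)) ** 1.5

ratio_classical = partition_sum(1e-6) / partition_integral(1e-6)
ratio_quantum = partition_sum(1.0) / partition_integral(1.0)
print(ratio_classical, ratio_quantum)
```

When $a$ is small, meaning that $kT$ greatly exceeds the level spacing, the two expressions agree
closely; when $a$ is of order one the integral is a poor approximation, a reminder that the
result rests on the states being densely spaced on the scale of $kT$.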

It is tempting now to write the partition function of an $N$-particle gas as the product
of $N$ single particle partition functions. A microstate would be identified by the set of
wavevectors $\{\mathbf{k}_j\}$ labelling the single particle states occupied by particles $j = 1 \ldots N$,
together with a specification of the particle spin orientations. The total energy would be
$E = \sum_{j=1}^{N}\epsilon_{\mathbf{k}_j}$. After all, we did precisely this for the system of vacancies in (6.31). We
would get

$$Z_N = (2s+1)^N\sum_{\{\mathbf{k}_j\}}\exp\left(-\beta\sum_{j=1}^{N}\epsilon_{\mathbf{k}_j}\right) = \left[(2s+1)\sum_{\mathbf{k}_1}\exp(-\beta\epsilon_{\mathbf{k}_1})\right]\times\cdots\times\left[(2s+1)\sum_{\mathbf{k}_N}\exp(-\beta\epsilon_{\mathbf{k}_N})\right], \qquad (9.15)$$

so that $Z_N = Z_1^N$, since each factor in brackets is the partition function of a single particle
in the box. But this would be wrong, and would lead to the so-called Gibbs paradox.
We have failed to take account of particle indistinguishability.


9.4 Distinguishable and Indistinguishable Particles 

We now need to discuss the effect of the indistinguishability of particles in statistical 
thermodynamics. The issues can be explored using the system of harmonic oscillators 
discussed extensively in Chapter 5. A microstate of this system is a specification of 
the energy possessed by each harmonically bound particle. It was not emphasised in 
Chapter 5, but we do so now, that the particles were distinguished by their interaction 
with distinct, spatially separated, harmonic potentials, as illustrated in Figure 5.1. Equivalently,
all the particles could be imagined to be moving in a single potential, but could
be distinguished by each having a different colour, for example. The enumeration of 
microstates was performed on this basis. 

But this changes when a system is composed of particles that cannot be distinguished.
For example, consider the case of two harmonically bound particles viewed from an
angle that makes them indistinguishable, as illustrated in Figure 9.3. Our perception
would be that there is just one microstate of the system with the particle oscillation
amplitudes shown, not two. For the case of two colourless particles interacting with the
same potential, this is more than just a perception. According to the rules of quantum
mechanics, the two arrangements simply merge into one. Because of this, we have to be
careful not to overcount the microstates in systems involving particles confined to the
same spatial region.

Figure 9.3 Two harmonically bound particles are viewed from the side. The two situations
shown, where the right hand particle and then the left hand particle has the higher energy, cannot
be distinguished. By analogy, when two indistinguishable particles are held in the same potential,
we must take care not to overcount the microstates when constructing thermodynamic ensembles.

Taking an explicit example, the indistinguishability of particles means we cannot
label microstates of the oscillator system as we did in Section 4.2.2 using the numbers
of quanta in each oscillator $\{q_j\}$. For three particles in the same harmonic potential, the
microstates previously denoted (1,0,0), (0,1,0) and (0,0,1) in notation $(q_1, q_2, q_3)$ are in
fact identical if we cannot distinguish the particles. To repeat: this is not just a matter
of failing to perceive the difference. There is no difference.

The approach described in Section 9.1 might suggest that three states exist with
wavefunctions $\psi_1(\mathbf{r}_1)\psi_0(\mathbf{r}_2)\psi_0(\mathbf{r}_3)$, $\psi_0(\mathbf{r}_1)\psi_1(\mathbf{r}_2)\psi_0(\mathbf{r}_3)$ and $\psi_0(\mathbf{r}_1)\psi_0(\mathbf{r}_2)\psi_1(\mathbf{r}_3)$, where
$\psi_0(\mathbf{r}_j)$ and $\psi_1(\mathbf{r}_j)$ are the ground state and first excited state of the $j$th harmonically
bound particle. But this is incorrect. The point is that quantum mechanics ought to be
built without particle labelling. If we insist on using labels as an intuitive device, then
all physical results need to be independent of the way in which the labels are assigned
to individual particles. So the three states must be fused into one, and the ways in which
this can be done are explored in greater detail in Chapter 10.

We need to find another description of a microstate of the oscillators, and the natural
one to turn to is to specify the population distribution of particles in the so-called
single particle states of the potential. This is intrinsically free of particle labelling. For a
harmonic potential, the single particle states have energies $\hbar\omega/2$, $3\hbar\omega/2$ and so on. So
the three oscillating particles with one quantum of energy between them just considered
are represented as a single microstate corresponding to a population distribution of two
particles in the ground state and one in the first excited single particle state. Such a
description avoids specifying which particle is in which state, a situation that is then quite
acceptable.

The labelling of the states of this system according to populations of oscillators with
specified numbers of quanta $(n_0, n_1, \ldots)$ was explored in Section 8.1.2. We introduced this
scheme to describe macrostates of a system of distinguishable particles. Now we employ
it to provide a way to specify microstates when the particles are indistinguishable. There
are fewer microstates than was the case when we studied distinguishable particles in
separate potentials. For example, the $N = 3$, $Q = 4$ system has 15 microstates if the
particles are distinguishable, but only four if they are indistinguishable. In the notation
$(n_0, n_1, n_2, n_3, n_4)$ the microstates are labelled (2,0,0,0,1), (1,1,0,1,0), (1,0,2,0,0),
(0,2,1,0,0), as we found in Section 8.1.3.
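
The reduction from 15 to four microstates can be reproduced by enumeration. The sketch below
lists the labelled assignments of four quanta to three particles and then discards the labels
by converting each assignment into its population tuple $(n_0, \ldots, n_4)$; the function names
are assumptions for illustration.

```python
from collections import Counter
from itertools import product

def distinguishable_microstates(n_particles, quanta):
    """All labelled assignments (q_1, ..., q_N) with the given total number of quanta."""
    return [q for q in product(range(quanta + 1), repeat=n_particles) if sum(q) == quanta]

def populations(state, quanta):
    """(n_0, n_1, ..., n_Q): the number of particles holding k quanta, for each k."""
    counts = Counter(state)
    return tuple(counts.get(k, 0) for k in range(quanta + 1))

labelled = distinguishable_microstates(3, 4)
unlabelled = sorted({populations(state, 4) for state in labelled}, reverse=True)

print(len(labelled))
for pops in unlabelled:
    print(pops)
```

Each population tuple collects all the labelled microstates that differ only by a permutation of
particle labels, which is the merging described in the text.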

Indistinguishability has a profound effect on statistical thermodynamics. According
to Boltzmann’s principle, the entropy for three indistinguishable particles that together
possess four quanta is $k\ln 4$ when they are held in the same potential, in contrast to
$k\ln 15$ if the particles are held in spatially separated potentials. The number of possible
microstates changes and ensemble averages of system properties differ as a consequence.

As another example, consider the N = 3, Q = 9 oscillator system. The phase space for
distinguishable oscillators, or particles, was given in Figure 4.3, divided into spikiness
macrostates with multiplicities shown in Figure 4.4. If the particles are indistinguishable,
however, the phase space is cut down in size as illustrated in Figure 9.4. Microstates
in Figure 4.3 that are distinguishable from one another only through the swapping of
particle labels are merged. Thus the three red microstates at the vertices of Figure 4.3,
denoted as $(q_1, q_2, q_3)$ = (9,0,0), (0,9,0), and (0,0,9), and considered to be different
if the particles are distinguishable, are replaced by a single red microstate in Figure 9.4
that could be labelled $(n_0, n_1, n_2, n_3, n_4, n_5, n_6, n_7, n_8, n_9)$ = (2,0,0,0,0,0,0,0,0,1). We
can regard the new phase space as a folded down version of the old phase space, in this
case approximately one-sixth of the size.

The revised number of microstates in each spikiness macrostate is illustrated in 
Figure 9.5. Notice that the statistics of spikiness, such as the mean and standard
deviation, differ considerably with respect to the situation in Figure 4.4 for distinguishable
particles.



Figure 9.4 An illustration of the phase space available to N = 3 indistinguishable particles (or
oscillators) possessing Q = 9 quanta. The reduced phase space of twelve microstates comprises
approximately 1/6, or 1/N!, of the original phase space given in Figure 4.3, shown in outline in
the lower left. Spikiness macrostates are shown in different colours.

Figure 9.5 Microstate multiplicity of spikiness for three indistinguishable oscillators holding
nine quanta between them (horizontal axis: spikiness Sp). This should be contrasted with
Figure 4.4 for distinguishable oscillators.



Figure 9.6 The phase space of three distinguishable oscillators with a constant total number
of quanta Q is shown as a triangle in a space with coordinate axes denoting the number
of quanta in each oscillator $q_i$. Microstates in this phase space can be categorised according
to whether none of the $q_i$ are the same (green); two are the same (black); or all three
are the same (white). For large Q, the green region dominates the total area, and the correction
for indistinguishability consists of dividing the multiplicity of the blue phase space by a
factor of 6.


There is an instructive way to understand how the sizes of phase spaces of distinguishable
and indistinguishable oscillators are approximately related. In Figure 9.6 the phase
space of three distinguishable, harmonically bound particles with a fixed total number of
quanta Q is shown as a blue continuum of points. The space is then shown divided into
regions. Microstates where the three particles each possess different numbers of quanta,
indicated by different coordinates $q_i$, are shown in green. Microstates where two of the
$q_i$ are the same are shown in black, and there is one unique point where all three $q_i$ are
the same.

We have built a phase space for distinguishable oscillators, but now we seek to
understand how it can be reduced in size to correct for indistinguishability. The number
of microstates $\Omega_{\rm indist}$ available to indistinguishable oscillators comes from merging
microstates in the six-fold repeated regions coloured green, and the three-fold repeated
regions coloured black, such that

$$\Omega_{\rm indist} = \frac{1}{6}\Omega_{\rm green} + \frac{1}{3}\Omega_{\rm black} + \Omega_{\rm white}. \qquad (9.16)$$



For example, with Q = 9 we can deduce from Figure 4.3 that $\Omega_{\rm green} = 42$, $\Omega_{\rm black} = 12$
and $\Omega_{\rm white} = 1$ such that $\Omega_{\rm indist} = 7 + 4 + 1 = 12$ as shown in Figure 9.4.
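
The three multiplicities can also be obtained by direct enumeration rather than by reading them off the figure. The Python sketch below is illustrative only (the function name `classify` is ours): it sorts the microstates of three labelled oscillators according to how many of the $q_i$ coincide, and then applies the merging rule (9.16).

```python
from itertools import product

def classify(Q):
    """Count microstates of three labelled oscillators holding Q quanta,
    grouped by how many distinct values appear among (q_1, q_2, q_3)."""
    green = black = white = 0
    for q in product(range(Q + 1), repeat=3):
        if sum(q) != Q:
            continue
        distinct = len(set(q))
        if distinct == 3:
            green += 1   # all q_i different: six label permutations
        elif distinct == 2:
            black += 1   # two q_i equal: three label permutations
        else:
            white += 1   # all q_i equal: a single point
    return green, black, white

green, black, white = classify(9)
omega_indist = green // 6 + black // 3 + white   # equation (9.16)
print(green, black, white, omega_indist)          # 42 12 1 12
```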

Large Q at a given N suggests a system at high temperature. In this regime, it is
visually clear that phase space is dominated by the green region, and so we can make
the approximation $\Omega_{\rm indist} \approx (1/6)\Omega_{\rm green}$ for a system consisting of three oscillators with
a large number of quanta. For high temperatures, or equivalently classical conditions,
it becomes a very good approximation to relate the sizes of the phase spaces of
distinguishable and indistinguishable oscillators by

$$\Omega_{\rm indist} \approx \frac{1}{6}\Omega_{\rm dist}. \qquad (9.17)$$



We now generalise the argument and make the claim that for the case of N oscillators,
the correction factor is approximately 1/N! for high temperature, classical conditions.
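
A numerical check makes the claim plausible. In the Python sketch below (function names are ours), the number of microstates for labelled oscillators is the standard count of ways to share Q quanta among N oscillators, and the number for unlabelled oscillators is the number of partitions of Q into at most N parts, evaluated with a small dynamic programme. Neither formula is taken from this section; they are stated here as assumptions of the sketch.

```python
from math import comb, factorial

def omega_dist(N, Q):
    """Ways to share Q quanta among N labelled oscillators."""
    return comb(Q + N - 1, N - 1)

def omega_indist(N, Q):
    """Ways to share Q quanta among N unlabelled oscillators.

    This is the number of partitions of Q into at most N parts, which equals
    the number of partitions of Q into parts no larger than N.
    """
    ways = [1] + [0] * Q            # ways[q]: partitions of q with the parts allowed so far
    for part in range(1, N + 1):    # allow parts of size 1, then 2, ..., up to N
        for q in range(part, Q + 1):
            ways[q] += ways[q - part]
    return ways[Q]

for Q in (9, 90, 900, 9000):
    ratio = omega_indist(3, Q) / omega_dist(3, Q)
    print(Q, ratio, 1 / factorial(3))
```

For N = 3 the ratio starts at 12/55 for Q = 9 and approaches 1/6 as Q grows, in line with (9.17).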


9.5 Partition Function of an N-Particle Gas


We return to the system of particles in a box and attempt to take account of particle
indistinguishability in computing the partition function $Z_N$. For a set of N distinguishable
particles in the box, the partition function $Z_N^{\rm dist}$ is indeed given by (9.15), but from the
arguments given in Section 9.4, this involves an overcounting of microstates if the
particles are indistinguishable. The particles are held in the same potential (the box);
they are not in separate potentials such as atoms attached to specific lattice sites.

The partition function in (9.15) is a sum of contributions where each of the N
distinguishable particles assumes one of the wavevectors in the single particle phase space.
Let us consider instead a sum where all the particles are in different single particle states,
which is clearly a smaller number:

$$Z_N^{\rm diff} = (2s+1)^N \sum_{\mathbf{k}_1 \neq \mathbf{k}_2 \neq \cdots \neq \mathbf{k}_N} \exp\left[-\beta\left(E_{\mathbf{k}_1} + \cdots + E_{\mathbf{k}_N}\right)\right] < Z_N^{\rm dist} = Z_1^N. \qquad (9.18)$$


For the conditions of classical gas behaviour, namely, high temperature and low density,
this sum makes the dominant contribution to the partition function. The argument is
the same as the one we employed for the oscillators in the previous section: the sum
$Z_N^{\rm diff}$ is analogous to the green region of the three oscillator phase space, and as long
as the phase space is large, the partition function of distinguishable particles $Z_N^{\rm dist}$ is
approximately given by $Z_N^{\rm diff}$. Now, a specific set of N different wavevectors {k} will
appear N! times in $Z_N^{\rm diff}$, corresponding to the ways in which N particle labels can be
assigned to N wavevectors, and so $Z_N^{\rm diff}$ is N! times too large if the particles are in fact
indistinguishable. Therefore, in order to take into account the indistinguishability, we
divide this contribution by N!.


But since $Z_N^{\rm diff} \approx Z_N^{\rm dist} = Z_1^N$, the partition function of a classical N-particle gas is then
to be written as

$$Z_N^{\rm indist} = Z_N = \frac{Z_N^{\rm diff}}{N!} \approx \frac{Z_1^N}{N!}, \qquad (9.19)$$

which replaces (9.15) and is now inclusive of an approximate treatment of indistinguishability.
For low temperatures, we would need to find another way to construct a partition
function to account for indistinguishability, and we shall return to this in Chapters 11 and
12. But for now we investigate the properties of $Z_N$ for indistinguishable particles and try
to establish connections with the treatment of the ideal gas in classical thermodynamics.
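
The quality of the approximation in (9.19) can be explored in a toy model. The Python sketch below is not the gas in a box: as a simplifying assumption it takes two spinless particles sharing a ladder of equally spaced single particle levels (energies 0, 1, 2, ... in arbitrary units) and sums the Boltzmann factors explicitly. The function name and the choice of 400 levels are ours.

```python
from math import exp

def toy_partition_functions(beta, n_levels=400):
    """Two particles on equally spaced levels e_j = j (arbitrary energy units).

    Returns (z_dist, z_diff, z_unlabelled):
      z_dist       labelled particles, any pair of states (equal to Z_1**2)
      z_diff       labelled particles, restricted to different states
      z_unlabelled unlabelled pairs counted once, double occupancy allowed
    """
    w = [exp(-beta * j) for j in range(n_levels)]   # Boltzmann factors
    z1 = sum(w)
    z_dist = z1 ** 2
    z_diff = sum(w[j] * w[k]
                 for j in range(n_levels) for k in range(n_levels) if j != k)
    z_unlabelled = sum(w[j] * w[k]
                       for j in range(n_levels) for k in range(j, n_levels))
    return z_dist, z_diff, z_unlabelled

for beta in (2.0, 0.5, 0.05):   # decreasing beta means increasing temperature
    z_dist, z_diff, z_unlabelled = toy_partition_functions(beta)
    print(beta, z_diff / z_dist, z_unlabelled / (z_dist / 2))
```

At high temperature (small beta) both ratios approach unity, so dividing $Z_1^2$ by 2! is a good estimate of the explicit unlabelled count; at low temperature the two differ appreciably, which is the regime where the simple correction fails.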


9.6 Thermal Properties and Consistency 
with Classical Thermodynamics 


From the partition function we are able to calculate canonical averages of system
properties. First we evaluate the mean energy using (6.13):

$$\langle E \rangle = -\frac{\partial \ln Z_N}{\partial \beta}. \qquad (9.20)$$



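Using $Z_N = Z_1^N/N!$ together with (9.13) and (9.14), we find

$$\langle E \rangle = -N\frac{\partial \ln Z_1}{\partial \beta} = N\frac{\partial \ln \lambda_{\rm th}^3}{\partial \beta} = \frac{3N}{2\beta} = \frac{3}{2}NkT, \qquad (9.21)$$
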


consistent with the calculation made in Section 6.1.1, and with the equipartition theorem 
as well. 
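
As a quick numerical cross-check of (9.20) and (9.21), the derivative of $\ln Z_N$ with respect to $\beta$ can be estimated by a central finite difference. The sketch below assumes $Z_1 = (2s+1)V/\lambda_{\rm th}^3$ with $\lambda_{\rm th} = [h^2\beta/(2\pi m)]^{1/2}$, as used in this section; the particle number, volume and mass are arbitrary illustrative values.

```python
from math import log, pi, lgamma

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K

def ln_ZN(beta, N, V, m, s=0):
    """ln of Z_N = Z_1**N / N! with Z_1 = (2s+1) V / lambda_th**3."""
    lam = (H**2 * beta / (2 * pi * m)) ** 0.5
    ln_Z1 = log((2 * s + 1) * V) - 3 * log(lam)
    return N * ln_Z1 - lgamma(N + 1)      # lgamma(N + 1) = ln N!

def mean_energy(beta, N, V, m, s=0, rel_step=1e-6):
    """<E> = -d ln Z_N / d beta, estimated by a central difference."""
    d = beta * rel_step
    return -(ln_ZN(beta + d, N, V, m, s) - ln_ZN(beta - d, N, V, m, s)) / (2 * d)

T = 300.0                   # kelvin (illustrative)
N = 1000                    # particles (illustrative)
V = 1.0e-3                  # cubic metres (illustrative)
m = 1.67262192e-27          # proton mass, kg
E = mean_energy(1 / (K * T), N, V, m)
print(E, 1.5 * N * K * T)   # the two values should agree closely
```

The N! factor does not depend on $\beta$ and so drops out of the derivative, which is why the mean energy is unaffected by the indistinguishability correction.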

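Next we evaluate the Helmholtz free energy. Assuming N is large, we can use Stirling's
approximation $\ln N! \approx N\ln N - N$ to write

$$\ln Z_N \approx N\ln Z_1 - N\ln N + N = N\ln(eZ_1/N) = N\ln\left(\frac{(2s+1)eV}{N\lambda_{\rm th}^3}\right), \qquad (9.22)$$

where e ≈ 2.7183 is the base for natural logarithms, and so

$$F = -NkT\ln\left(\frac{(2s+1)eV}{N\lambda_{\rm th}^3}\right) = -NkT\ln\left(\frac{(2s+1)eV(2\pi mkT)^{3/2}}{Nh^3}\right). \qquad (9.23)$$

Therefore, from (8.11) the Gibbs entropy is

$$S_G = \frac{\langle E\rangle}{T} - \frac{F}{T} = \frac{\langle E\rangle}{T} + k\ln Z_N = \frac{3}{2}Nk + Nk\ln\left(\frac{(2s+1)eV}{N\lambda_{\rm th}^3}\right) = Nk\ln\left(\frac{(2s+1)e^{5/2}(2\pi mkT)^{3/2}}{(N/V)h^3}\right). \qquad (9.24)$$
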


This should be compared with the result (2.22) from classical thermodynamics. We have 
worked out the entropy of the monatomic ideal classical gas, according to the principles 
of statistical thermodynamics, and shown that it is compatible with the form derived on 
the basis of classical thermodynamics. Equation (9.24) is known as the Sackur-Tetrode 
expression for the entropy of a monatomic ideal gas in the classical regime of high 
temperature and low density. This is a key result, to demonstrate that statistical
thermodynamics accounts quantitatively for a result derived in classical thermodynamics!

In (2.54), we showed that the chemical potential of a monatomic ideal classical gas
was

$$\mu = kT\ln\left(\frac{N\lambda_{\rm th}^3}{(2s+1)V}\right), \qquad (9.25)$$



and comparing this with (2.53) implies that the thermal de Broglie wavelength defined in 
(9.14), with s = 0, corresponds to the quantity introduced in our discussion of classical 
thermodynamics (where naturally spin was taken to be zero) in (2.52). Moreover, we 
can identify the constant c that appears in the classical thermodynamic expression for 
entropy in (2.22): we have 


(2s +1) \2tt m 


(9.26) 


This shows that c is related to Planck’s constant, a quantum concept. We could not have 
obtained this value within the framework of classical thermodynamics. 

Finally, we comment on the Gibbs paradox referred to in Section 9.3. If we had not 
divided by the N! factor in (9.19), we would not have obtained the factor of N inside the
logarithm in the Sackur-Tetrode expression (9.24) for the entropy. We could not have 
matched the form of the ideal gas entropy derived from classical thermodynamics. Even 
more seriously, the entropy would then not have been extensive, that is, it would not 
double when N and V are both doubled. Historically, this was a puzzle that was resolved 
only by recognising that the particles of a gas are indistinguishable, and that as a result, 
there are fewer microstates available than might otherwise be expected. 
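
Both points can be illustrated numerically. The Python sketch below evaluates the Sackur-Tetrode entropy (9.24) and, for comparison, the entropy that would follow from $Z_N = Z_1^N$ without the division by N!, namely $Nk[\ln((2s+1)V/\lambda_{\rm th}^3) + 3/2]$. The choice of one mole of argon at 298.15 K and $10^5$ Pa is an assumption made purely for illustration, as are the function names.

```python
from math import log, pi, sqrt

H = 6.62607015e-34     # Planck constant, J s
K = 1.380649e-23       # Boltzmann constant, J/K
N_A = 6.02214076e23    # Avogadro constant, 1/mol
U = 1.66053907e-27     # atomic mass unit, kg

def thermal_wavelength(m, T):
    """Thermal de Broglie wavelength h / sqrt(2 pi m k T)."""
    return H / sqrt(2 * pi * m * K * T)

def entropy_with_factorial(N, V, T, m, s=0):
    """Sackur-Tetrode entropy (9.24), from Z_N = Z_1**N / N!."""
    lam = thermal_wavelength(m, T)
    return N * K * (log((2 * s + 1) * V / (N * lam**3)) + 2.5)

def entropy_without_factorial(N, V, T, m, s=0):
    """Entropy from Z_N = Z_1**N, with the division by N! omitted."""
    lam = thermal_wavelength(m, T)
    return N * K * (log((2 * s + 1) * V / lam**3) + 1.5)

# Illustrative case: one mole of argon treated as a spin zero ideal gas
T, p, m = 298.15, 1.0e5, 39.948 * U
N = N_A
V = N * K * T / p
S = entropy_with_factorial(N, V, T, m)
print(S)   # molar entropy in J/K, roughly 155

# Double N and V together: an extensive entropy should exactly double
excess_with = entropy_with_factorial(2 * N, 2 * V, T, m) - 2 * S
excess_without = (entropy_without_factorial(2 * N, 2 * V, T, m)
                  - 2 * entropy_without_factorial(N, V, T, m))
print(excess_with, excess_without, 2 * N * K * log(2))
```

With the N! correction the excess vanishes, whereas without it the doubled system acquires a spurious extra entropy of $2Nk\ln 2$.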


9.7 Condition for Classical Behaviour 

We have stated several times that we expect intuitively that the classical regime of gas 
behaviour should correspond to high temperatures and low densities, but now we obtain 
a more precise condition. 

Recall that the key requirement for deriving the approximate form of the N-particle
partition function was that there was a dominant contribution to the partition function 
arising from microstates where single particle states in the box are occupied by at most 
one particle. We need a criterion to specify conditions where the probability of occupation 
of a single particle state by more than one particle is negligible. 

First, we use the Gibbs entropy to determine a rough estimate of the size of the phase 
space at the average system energy, using the argument in Section 8.2.2. From (9.24) 
we have 

$$S_G = Nk\ln\left(\frac{(2s+1)e^{5/2}V}{N\lambda_{\rm th}^3}\right) \approx k\ln\Omega(\langle E\rangle), \qquad (9.27)$$

and so $\Omega(\langle E\rangle) \sim (n_q V/N)^N$, where we have defined $n_q = 1/\lambda_{\rm th}^3$. $\Omega(\langle E\rangle)$ is analogous
to the area of the blue triangle in Figure 9.6. Now, the argument that single particle states
are rarely occupied by more than one particle rests on the idea that the system phase
space is very large, that is, $\Omega(\langle E\rangle) \gg 1$. This is analogous to arguing that the green
regions of the triangle in Figure 9.6 are dominant. The condition holds when $n_q V \gg N$,
or when the particle density is much less than $n_q$, namely,

$$n = \frac{N}{V} \ll n_q = \left(\frac{2\pi mkT}{h^2}\right)^{3/2}. \qquad (9.28)$$

Figure 9.7 The quantum concentration $n_q(T)$ or its inverted form $T_q(n)$ separates the
density-temperature plot of gas conditions into regions where classical and quantum behaviour is
to be expected (horizontal axis: temperature T).



The temperature dependence of $n_q$ is sketched in Figure 9.7.

The classical regime corresponds to densities n well below $n_q$ for a given temperature,
or temperatures well above a threshold $T_q = n^{2/3}h^2/(2\pi mk)$ for a given density. We
interpret this to mean that the volume per particle should be much larger than the volume
$\lambda_{\rm th}^3$ in order that the gas should behave classically, in the sense that its thermodynamic
properties should match those we obtained in Chapter 2. The particles need to be many
thermal de Broglie wavelengths apart, on average. For a proton at room temperature, $\lambda_{\rm th}$
is about 0.1 nm; so such a gas would need to be quite dense to depart from classical
behaviour.
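
The estimate for the proton is easily reproduced. The short Python sketch below takes room temperature to be 300 K and, as a further illustrative assumption, compares the resulting quantum concentration with the density of an ideal gas at a pressure of $10^5$ Pa.

```python
from math import pi, sqrt

H = 6.62607015e-34          # Planck constant, J s
K = 1.380649e-23            # Boltzmann constant, J/K
M_PROTON = 1.67262192e-27   # proton mass, kg

def thermal_wavelength(m, T):
    """Thermal de Broglie wavelength [h^2 / (2 pi m k T)]^(1/2)."""
    return H / sqrt(2 * pi * m * K * T)

T = 300.0                            # assumed room temperature, K
lam = thermal_wavelength(M_PROTON, T)
n_q = lam ** -3                      # quantum concentration, 1/m^3
n_gas = 1.0e5 / (K * T)              # ideal gas density at 1e5 Pa, 1/m^3
print(lam, n_q, n_gas / n_q)
```

The wavelength comes out close to $10^{-10}$ m, and the gas density is several orders of magnitude below $n_q$, so the classical condition (9.28) is comfortably met.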

The root mean square velocity of a gas particle in the classical regime is $v_{\rm rms} =
\langle v^2\rangle^{1/2} = (3kT/m)^{1/2}$, and the quantum de Broglie wavelength of a particle at such
a speed is $h/(mv_{\rm rms}) = h/(3mkT)^{1/2}$, which is similar in form to $\lambda_{\rm th}$ in (9.14), and
this accounts for the name. The criterion (9.28) involving $\lambda_{\rm th}$ therefore suggests that
departure from classical thermodynamic gas behaviour is a quantum effect. It is partly
so because at such densities the discrete nature of energy levels in the system becomes
apparent, making the indistinguishability correction more complicated than the simple
factor of 1/N!, but also because the non-negligible multiple occupancy of single particle
states at high gas densities brings into the discussion some fundamental rules of quantum
mechanics that we meet in Chapter 10. In any case, nonclassical behaviour emerges for
particle densities greater than $n_q = \lambda_{\rm th}^{-3}$, and $n_q$ is known as the quantum concentration
for this reason.


Exercises

9.1 Show that the canonical partition function of a single atom with spin zero in a box
of volume V is $Z_1 = V\lambda_{\rm th}^{-3}$, where $\lambda_{\rm th} = [h^2/(2\pi mkT)]^{1/2}$ is the thermal de Broglie
wavelength. Estimate the thermal de Broglie wavelength of a sodium atom at room
temperature. Calculate the partition function for a gas of $N \gg 1$ indistinguishable
noninteracting atoms in the volume V, such that $N/V \ll \lambda_{\rm th}^{-3}$, and demonstrate that
the Helmholtz free energy is extensive. Would this model apply to sodium vapour
at room temperature?

9.2 For a system of three oscillators holding four quanta, determine the number of 
microstates for cases where they are distinguishable or indistinguishable. 

9.3 Calculate the mean and standard deviation of the spikiness of a system of three 
indistinguishable oscillators that hold nine quanta, and compare the outcome with 
the case of distinguishable oscillators discussed in Section 4.3. 

9.4 Write down the partition function of N particles confined to a 1-d harmonic potential
and in contact with a heat bath at temperature T, in the classical limit, given that the
classical partition function of one particle in an oscillator of frequency $\omega$ is $Z_1 =
kT/(\hbar\omega)$. Show that the chemical potential of the system is given by $kT\ln(N\hbar\omega/kT)$.

9.5 Two identical particles are tethered harmonically to points far apart and coupled
to a heat bath such that the partition function might be written $Z_1^2$ and the free
energy $-2kT\ln Z_1$. On the other hand, if the tether points were brought together,
then we should take account of indistinguishability and write the partition function as
$(1/2)Z_1^2$ and the free energy as $-2kT\ln Z_1 + kT\ln 2$, such that the system free energy
increases by $kT\ln 2$. Is this an energy change or an entropy change? [Hint: consider
(6.13) or (6.15)]. Is an external force required to bring the particles together? [Hint:
consider Section 3.2.2].

9.6 Under what circumstances might the atoms in a system be distinguishable and when 
might they be indistinguishable? 


10 

Quantum Gases 


We saw in Chapter 9 that a statistical thermodynamic treatment of a set of indistinguishable
noninteracting particles in a box gave us thermodynamic properties corresponding to
those of a classical ideal gas as long as the particle density did not exceed a
temperature-dependent quantum concentration $n_q$, defined in (9.28). When this condition is violated,
the thermodynamic properties begin to depart from classical behaviour because the
probability of multiple occupation of at least some single particle states in the box becomes
significant. When this happens, the correction for the overcounting of microstates due to
indistinguishability has to be revised, and we also have to implement certain quantum
mechanical rules regarding the multiple occupations of states. We describe these systems
as quantum gases, and the determination of their properties, while demonstrating
that they correspond to real behaviour, provides considerable experimental support for
statistical thermodynamic methods.


10.1 Spin and Wavefunction Symmetry 

We resolved the Gibbs paradox in Section 9.5 by regarding particles as indistinguishable.
It seems that it is meaningless to attach a label to a particular particle. Yet this is naturally
the way both classical and quantum mechanics have been developed. For example, if we
wish to describe the behaviour of a system of two particles (the electrons in the helium atom,
say), then the standard approach is to solve a Schrödinger equation involving a
Hamiltonian $H(\mathbf{r}_1, \mathbf{r}_2)$, to obtain a wavefunction $\psi(\mathbf{r}_1, \mathbf{r}_2)$ where the first position $\mathbf{r}_1$ is
the location of particle A and the second is the location of particle B. We take $|\psi(\mathbf{r}_1, \mathbf{r}_2)|^2$
to be proportional to the probability that particle A is located at position $\mathbf{r}_1$ and particle
B at $\mathbf{r}_2$.

But the particles are indistinguishable, so the event that we have just mentioned cannot
be meaningful. Instead, we must ask for the probability that an unspecified particle is
found at position $\mathbf{r}_1$ while the other is at $\mathbf{r}_2$. We ought to develop quantum mechanics
without individual particle labels at all. But if we do introduce particle labelling into a
description of indistinguishable particles, it must be in such a way that the choice of
labelling doesn’t matter to the physics. The probability that A is at $\mathbf{r}_1$ and B at $\mathbf{r}_2$ must
be equal to the probability that B is at $\mathbf{r}_1$ and A at $\mathbf{r}_2$.

This invariance of the square modulus of the wavefunction under the swapping of
particle labels is equivalent to an invariance under the swapping of particle positions,
namely

$$|\psi(\mathbf{r}_1, \mathbf{r}_2)|^2 = |\psi(\mathbf{r}_2, \mathbf{r}_1)|^2, \qquad (10.1)$$

or more generally, for a wavefunction describing more than two particles:

$$|\psi(\mathbf{r}_1, \mathbf{r}_2, \cdots)|^2 = |\psi(\mathbf{r}_2, \mathbf{r}_1, \cdots)|^2, \qquad (10.2)$$

where the dots denote additional particle positions. This means that

$$\psi(\mathbf{r}_1, \mathbf{r}_2, \cdots) = e^{i\theta}\psi(\mathbf{r}_2, \mathbf{r}_1, \cdots), \qquad (10.3)$$

and a second swapping implies that $\psi(\mathbf{r}_1, \mathbf{r}_2, \cdots) = e^{2i\theta}\psi(\mathbf{r}_1, \mathbf{r}_2, \cdots)$, indicating that
$\exp(2i\theta) = 1$, or $\theta = 0$ or $\pi$. The so-called exchange symmetry therefore leads to two
possibilities: the wavefunction can be either even or odd under the swapping of labels:

$$\psi(\mathbf{r}_1, \mathbf{r}_2, \cdots) = \pm\psi(\mathbf{r}_2, \mathbf{r}_1, \cdots). \qquad (10.4)$$

It is a remarkable fact that the choice of symmetry (+ sign) or antisymmetry (− sign) of
such a wavefunction under the exchange of labels is connected to the spin of the particle.
The spin-statistics theorem states that the wavefunction of particles with integer spin must 
be symmetric under exchange of particle labels, and that of particles with half-integer 
spin must be antisymmetric. Integer spin particles are known as bosons, and half-integer 
spin particles are called fermions, in recognition of this fundamental difference. The spin- 
statistics theorem arises from considering the very meaning of particle spin in quantum 
mechanics, and is beyond the scope of this book. 


10.2 Pauli Exclusion Principle 

The antisymmetry requirement for fermions means that only one particle can occupy 
each single particle state. This is the familiar Pauli exclusion principle in atomic physics 
that limits the occupation of atomic orbitals to one electron (a spin half fermion) of each 
spin orientation. Let us see how this arises. 

Consider two electrons in a helium atom, each with the same orientation of spin.
The spatial part of their wavefunction $\psi(\mathbf{r}_1, \mathbf{r}_2)$ possesses an antisymmetry under label
exchange, such that $\psi(\mathbf{r}_1, \mathbf{r}_2) = -\psi(\mathbf{r}_2, \mathbf{r}_1)$. It is immediately apparent that the
probability $|\psi(\mathbf{r}_1, \mathbf{r}_1)|^2$ that the two electrons should be found at the same position is zero.
The electrons appear to avoid each other, for a reason that has nothing to do with their
electrostatic repulsion, but instead is due to the antisymmetry of their joint wavefunction.

Now imagine constructing a wavefunction of two noninteracting fermions in a box
by forming a product of two single particle states, as we claimed was possible in
Section 9.1. We consider single particle states with wavevectors $\mathbf{k}$ and $\mathbf{k}'$ such that
$\psi_{\mathbf{k}\mathbf{k}'}(\mathbf{r}_1, \mathbf{r}_2) = \psi_{\mathbf{k}}(\mathbf{r}_1)\psi_{\mathbf{k}'}(\mathbf{r}_2)$ where $\psi_{\mathbf{k}}(\mathbf{r}_1) \propto \sin(\mathbf{k}\cdot\mathbf{r}_1)$ and so on. However, this
expression is not antisymmetric in particle label. For particles with the same spin
orientation, a wavefunction with acceptable exchange symmetry has the spatial part

$$\psi_{\mathbf{k}\mathbf{k}'}(\mathbf{r}_1, \mathbf{r}_2) \propto \psi_{\mathbf{k}}(\mathbf{r}_1)\psi_{\mathbf{k}'}(\mathbf{r}_2) - \psi_{\mathbf{k}}(\mathbf{r}_2)\psi_{\mathbf{k}'}(\mathbf{r}_1). \qquad (10.5)$$

Clearly, the wavefunction for a situation where both particles have the same wavevector
vanishes. A second fermion is excluded from a single particle state that is already
occupied. Furthermore, $\psi_{\mathbf{k}\mathbf{k}'}(\mathbf{r}_1, \mathbf{r}_1) = 0$ and the two particles avoid each other spatially,
as in helium. Our understanding of atomic structure rests upon this principle, and the
same ideas apply to the statistical thermodynamics of the fermion gas. Equation (10.5)
demonstrates explicitly that what we might have thought were two microstates actually
corresponds to just one.

We shall pursue the consequences of this in Chapter 12, but first we should consider
exchange symmetry in the boson gas. The wavefunction of two particles needs to be
symmetric in label exchange, and so $\psi_{\mathbf{k}\mathbf{k}'}(\mathbf{r}_1, \mathbf{r}_2) \propto \psi_{\mathbf{k}}(\mathbf{r}_1)\psi_{\mathbf{k}'}(\mathbf{r}_2) + \psi_{\mathbf{k}}(\mathbf{r}_2)\psi_{\mathbf{k}'}(\mathbf{r}_1)$ is
the admissible combination. The wavefunction does not vanish when both particles
are assigned the same wavevector. Bosons do not avoid each other: $\psi_{\mathbf{k}\mathbf{k}'}(\mathbf{r}_1, \mathbf{r}_1) \neq 0$.
In fact, a single particle state can be occupied by any number of particles, leading to
interesting nonclassical behaviour when the density exceeds the quantum concentration,
as we shall see in Chapter 11.
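
The contrast between the two symmetries can be seen in a simple numerical example. The Python sketch below is a one-dimensional simplification of the situation in the text: it uses unnormalised standing waves $\sin(k\pi x/L)$ in a box of length L, labelled by an integer k rather than a wavevector, and the function names are ours.

```python
from math import sin, pi

def single_particle_state(k, x, L=1.0):
    """Unnormalised standing wave sin(k pi x / L) in a 1-d box of length L."""
    return sin(k * pi * x / L)

def two_fermion_state(k, kp, x1, x2):
    """Antisymmetric combination, the 1-d analogue of (10.5)."""
    return (single_particle_state(k, x1) * single_particle_state(kp, x2)
            - single_particle_state(k, x2) * single_particle_state(kp, x1))

def two_boson_state(k, kp, x1, x2):
    """Symmetric combination appropriate to bosons."""
    return (single_particle_state(k, x1) * single_particle_state(kp, x2)
            + single_particle_state(k, x2) * single_particle_state(kp, x1))

# Swapping the positions changes the sign for fermions but not for bosons
print(two_fermion_state(1, 2, 0.3, 0.7), two_fermion_state(1, 2, 0.7, 0.3))
print(two_boson_state(1, 2, 0.3, 0.7), two_boson_state(1, 2, 0.7, 0.3))

# Same position, or same single particle state: zero for fermions only
print(two_fermion_state(1, 2, 0.3, 0.3), two_fermion_state(2, 2, 0.3, 0.7))
print(two_boson_state(1, 2, 0.3, 0.3), two_boson_state(2, 2, 0.3, 0.7))
```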


10.3 Phenomenology of Quantum Gases 


We have mentioned that nonclassical effects should emerge for gases that are denser
than the quantum concentration, and it is important to estimate the necessary conditions
for various example systems. The quantum concentration $n_q$ is given by

$$n_q = \left(\frac{2\pi mkT}{h^2}\right)^{3/2}, \qquad (10.6)$$



and its magnitude can be reduced, and therefore made more attainable, if we consider a
low temperature or a particle with a low mass. If we insert the proton mass, we get $n_q \approx
2 \times 10^{26}\,T^{3/2}$ m$^{-3}$, where T is in kelvin. Atomic or molecular gases at room temperature
and pressure have a particle density of about p/kT, which is of order $10^{26}$ m$^{-3}$. So, if a
gas in a closed box at such a density were cooled, nonclassical effects might be expected
at temperatures below about 1 K. By then, however, it will most likely no longer be a
gas, having condensed into a liquid or a solid.
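
These order of magnitude estimates follow directly from (10.6). The Python sketch below (function names are ours) evaluates the quantum concentration for a particle of proton mass at 1 K, and inverts (10.6) to give the temperature at which a gas of density $10^{26}$ m$^{-3}$ would reach its quantum concentration.

```python
from math import pi

H = 6.62607015e-34          # Planck constant, J s
K = 1.380649e-23            # Boltzmann constant, J/K
M_PROTON = 1.67262192e-27   # proton mass, kg

def quantum_concentration(m, T):
    """n_q = (2 pi m k T / h^2)^(3/2), equation (10.6), in 1/m^3."""
    return (2 * pi * m * K * T / H**2) ** 1.5

def threshold_temperature(n, m):
    """Temperature at which n_q equals the density n: T_q = n^(2/3) h^2 / (2 pi m k)."""
    return n ** (2.0 / 3.0) * H**2 / (2 * pi * m * K)

coefficient = quantum_concentration(M_PROTON, 1.0)   # multiplies T^(3/2)
T_q = threshold_temperature(1.0e26, M_PROTON)
print(coefficient, T_q)
```

The coefficient is close to $2 \times 10^{26}$ m$^{-3}$ and the threshold temperature is a little below 1 K, consistent with the figures quoted above.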

Nevertheless, the boson liquid $^4$He, with a particle density of $2 \times 10^{28}$ m$^{-3}$, exhibits
some rather amazing phenomena at a temperature of around 2.17 K. The most striking 
effect is that the liquid appears to lose most of its viscosity, giving rise to strange flow 
behaviour, such as an ability to pass through tiny pores, or to creep up surfaces drawn by 
capillary action, unimpeded by viscous drag. The phenomenon is known as superfluidity. 
The liquid also appears to have an anomalously high thermal conductivity, and has a 
peak in its heat capacity in this temperature region. Might this be an effect that can be 
explained by the statistical thermodynamic properties of a quantum gas? 

More recently, similar peculiar phenomena have been detected in atomic gases, rather
than in liquids, trapped in small quantities and cooled to very low temperatures in
various ways. The signature, in this case, is a departure from the Maxwell-Boltzmann
distribution of atomic speeds. The statistical behaviour of the particles differs radically
from that expected of a classical gas. Again, might this be a quantum effect?

It is tempting to ascribe other unexpected phenomena of macroscopic systems at low 
temperature to quantum gas properties. One of the most dramatic of these effects is 
the loss of electrical resistance of many materials below a transition temperature: the 
phenomenon of superconductivity. This is accompanied by peculiar magnetic effects 
whereby the material appears to be unable to accommodate magnetic flux, leading to 
repulsion from magnetic fields, the so-called Meissner effect. The temperature for the 
superconducting transition varies over a wide range for different materials, and great
technological gains would be available if a material with room temperature
superconductivity could be discovered. Is this something to do with a gas exceeding its quantum
concentration?

The final example might come as a surprise, but evidence would suggest that we 
interact every day with a quantum gas at room temperature. The Drude model of the 
behaviour of conduction electrons in a metal is based on the idea that they act like a gas, 
in that they are free to move around their container (the metal) and suffer relatively few 
collisions with each other or with the atoms in the sample. When driven by a potential 
gradient, they flow like a gas to create an electric current. It has been proposed that this 
freedom of transport can also account for the high thermal conductivity of metals. So, 
what might be the quantum concentration of an electron gas in a metal? From (10.6), and
using the electron mass, we arrive at an estimate of $n_q \approx 2 \times 10^{21}\,T^{3/2}$ m$^{-3}$. In a metal,
the concentration of conduction electrons is about one per atom, or about $10^{28}$ m$^{-3}$.
The threshold temperature below which we might expect nonclassical thermodynamic
behaviour is therefore about $10^4$ K! At room temperature, the electron gas in a metal
is nonclassical. So might this explain anything about them? Well, the Drude model,
introduced in 1900 before the advent of quantum mechanics, had a significant problem.
If the electron gas in a metal were classical, then it would contribute a heat capacity of
(3/2)k per electron, and this is simply not observed. Might this be a quantum effect?
We shall address these questions in the next two chapters. 
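
The threshold temperatures for the systems mentioned in this section can be tabulated by inverting (10.6). In the Python sketch below the densities are those quoted in the text, while the particle masses (the electron mass and the mass of a $^4$He atom) are standard values supplied by us; the dictionary layout and names are illustrative.

```python
from math import pi

H = 6.62607015e-34       # Planck constant, J s
K = 1.380649e-23         # Boltzmann constant, J/K
U = 1.66053907e-27       # atomic mass unit, kg
M_ELECTRON = 9.1093837e-31   # electron mass, kg

def threshold_temperature(n, m):
    """Temperature at which the quantum concentration (10.6) equals the density n."""
    return n ** (2.0 / 3.0) * H**2 / (2 * pi * m * K)

systems = {
    # name: (particle density in 1/m^3, particle mass in kg)
    "liquid helium-4": (2.0e28, 4.002602 * U),
    "conduction electrons in a metal": (1.0e28, M_ELECTRON),
}

thresholds = {name: threshold_temperature(n, m) for name, (n, m) in systems.items()}
for name, T_q in thresholds.items():
    print(f"{name}: nonclassical below roughly {T_q:.3g} K")
```

For helium the estimate is a few kelvin, the same order as the temperature at which superfluidity appears, while for conduction electrons it is a few times $10^4$ K, far above room temperature.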


Exercises 

10.1 What physical property determines whether a particle is a fermion or a boson? What 
values of this property are distinctive for bosons? State an important symmetry 
property that must be satisfied by the wavefunction of a system of many bosons. 
What is the corresponding property that must be possessed by the wavefunction of 
a system of many fermions? 

10.2 The exchange symmetry of a wavefunction of two fermions suggests that the
probability of their being found at the same spatial point is zero. Interpret this in terms
of an effective interaction force between the particles.

10.3 The muon is an unstable elementary particle analogous in many ways to the electron 
but 200 times heavier. If we could replace all the free electrons in a metal with 
muons, would the result be a classical or a quantum gas? What (brief!) difference 
would it make to the heat capacity of the metal? 


11 


Boson Gas 


Bosons are particles with integer spin, named in honour of Satyendra Nath Bose 
(1894-1974), who, along with Einstein, developed the main ideas behind the statistical 
thermodynamics of a boson gas. We saw in Chapter 10 that quantum mechanics imposes 
no limit on the number of bosons that can occupy the same single particle energy level 
in a system of noninteracting particles. The population in each energy level is controlled 
by the prevailing temperature through the Boltzmann factor and so at low temperature, 
multiple occupancy of some low-lying energy levels is expected to be significant. In 
these circumstances, we cannot follow the procedure employed in Section 9.5 to
construct the N-particle canonical partition function $Z_N$. We used the one-particle partition
function $Z_1$ to build a partition function $Z_N^{\rm dist} = Z_1^N$ for N distinguishable particles and
then took approximate account of the indistinguishability by dividing by N!, a step
that presumes that multiple occupancy of single particle states can be neglected. If we
cannot rely on this assumption, the correction factor is not available in a simple form. 

Instead, the best way to explore the properties of the boson gas is through the grand
canonical ensemble. We consider first the statistical thermodynamic properties of a
system consisting of a single quantised standing wave state in k-space, and then deduce the
properties of an entire collection of such single particle states.

11.1 Grand Partition Function for Bosons in a Single 
Particle State 

The grand canonical ensemble that we developed in Chapter 7 is a treatment of a system
able to exchange energy and particles with a reservoir at temperature T and chemical
potential $\mu$. All the statistical properties are to be derived from the grand partition
function, which from (7.14) is given by

$$Z_G(\mu, V, T) = \sum_{{\rm microstates}\ i} \exp\left(\frac{\mu N_i - E_i}{kT}\right). \qquad (11.1)$$

The sum is over all microstates of the system in a volume V, each characterised by an
energy $E_i$ and a number of particles $N_i$.

We shall apply this to a system consisting of a single particle state at energy $\epsilon$. The
energy of a microstate is therefore $E_i = N_i\epsilon$, such that each microstate is fully specified
by the number of particles held, in which case we get

$$Z_G^{\epsilon}(\mu, V, T) = \sum_{N_i=0}^{\infty} \exp\left(\frac{(\mu - \epsilon)N_i}{kT}\right) = \frac{1}{1 - \exp\left(\frac{\mu - \epsilon}{kT}\right)}, \qquad (11.2)$$

having evaluated the sum as a geometric series. The superscript indicates that this is
the grand partition function of the single particle state at energy $\epsilon$. The reservoir that
provides energy and particles is essentially the remainder of the boson gas, distributed 
across all the other single particle states in the container, and presumed to act as a heat 
and particle bath with fixed temperature and chemical potential. Equivalently, we can 
instead imagine a more abstract external source of energy and particles. 
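
The geometric series in (11.2) is easily checked numerically. In the Python sketch below the explicit sum over occupation numbers is truncated at a large but finite value, which is an approximation we introduce for the purpose of the check; energies are measured in arbitrary units and the parameter values are illustrative.

```python
from math import exp

def grand_partition_sum(mu, eps, kT, n_max=2000):
    """Explicit sum over occupation numbers 0..n_max of exp((mu - eps) N / kT)."""
    return sum(exp((mu - eps) * n / kT) for n in range(n_max + 1))

def grand_partition_closed(mu, eps, kT):
    """Closed form 1 / (1 - exp((mu - eps)/kT)), valid only for mu < eps."""
    if mu >= eps:
        raise ValueError("the geometric series converges only for mu < eps")
    return 1.0 / (1.0 - exp((mu - eps) / kT))

mu, eps, kT = -0.5, 1.0, 1.0   # illustrative values in arbitrary energy units
print(grand_partition_sum(mu, eps, kT), grand_partition_closed(mu, eps, kT))
```

The two values agree to rounding error; the series converges only because the chemical potential lies below the energy of the state.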


11.2 Bose-Einstein Statistics 


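From (7.16), the mean population in the single particle state at energy $\epsilon$, also known as
the occupation number, is

$$\langle N\rangle_\epsilon = kT\left(\frac{\partial \ln Z_G^\epsilon}{\partial \mu}\right)_{T,V} = kT\left[1 - \exp\left(\frac{\mu - \epsilon}{kT}\right)\right]\frac{1}{kT}\frac{\exp\left(\frac{\mu - \epsilon}{kT}\right)}{\left[1 - \exp\left(\frac{\mu - \epsilon}{kT}\right)\right]^2} = \frac{1}{\exp\left(\frac{\epsilon - \mu}{kT}\right) - 1}, \qquad (11.3)$$

using $Z_G^\epsilon$ from (11.2). This expression is known as Bose-Einstein statistics. It has several
interesting features.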

If the chemical potential of the reservoir increases, while its temperature remains
constant, the mean population in the single particle state rises; explicitly

$$\left(\frac{\partial \langle N\rangle_\epsilon}{\partial \mu}\right)_{T,V} = \frac{1}{kT}\frac{\exp\left(\frac{\epsilon - \mu}{kT}\right)}{\left[\exp\left(\frac{\epsilon - \mu}{kT}\right) - 1\right]^2}, \qquad (11.4)$$

which makes physical sense. However, the reservoir chemical potential cannot be raised
above the energy of the state $\epsilon$, in order that the denominator in (11.3) should never be
negative (mean populations must be positive). In the limit that $\mu \to \epsilon$ from below, the
mean population can become very large, which of course is allowed for bosons.
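
Both features are easy to see numerically. The Python sketch below (function names are ours, energies in arbitrary units) evaluates the occupation number (11.3), compares the derivative (11.4) with a central finite difference, and shows the growth of the population as the chemical potential approaches the state energy from below.

```python
from math import exp, expm1

def bose_einstein(eps, mu, kT):
    """Mean occupation 1 / (exp((eps - mu)/kT) - 1), equation (11.3)."""
    if mu >= eps:
        raise ValueError("mu must lie below the state energy eps")
    return 1.0 / expm1((eps - mu) / kT)

def d_occupation_d_mu(eps, mu, kT):
    """Derivative of the occupation with respect to mu, equation (11.4)."""
    y = exp((eps - mu) / kT)
    return y / (kT * (y - 1.0) ** 2)

eps, kT = 1.0, 1.0
for mu in (-1.0, 0.0, 0.9, 0.99, 0.999):
    print(mu, bose_einstein(eps, mu, kT))   # grows without bound as mu -> eps

h = 1e-6
numeric = (bose_einstein(eps, 0.5 + h, kT) - bose_einstein(eps, 0.5 - h, kT)) / (2 * h)
print(numeric, d_occupation_d_mu(eps, 0.5, kT))
```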

Now recall that our strategy is to regard the standing wave single particle states as
independent systems in contact with the same particle reservoir. These have energies $E_{\mathbf{k}}$
given by (9.5) varying from $3h^2/(8mV^{2/3})$ (which is very small when V is large) up
to infinity. We conclude that the chemical potential of the particle reservoir has to be
less than $3h^2/(8mV^{2/3})$, or roughly speaking never positive. This is slightly strange, and
we’ll return to the implications later.

Considering (11.3) with $\mu < \epsilon$, the mean populations in the single particle states
decrease as the state energy $\epsilon$ increases, for a given $\mu$ and $T$, as illustrated in Figure 11.1.
The singularity in the Bose-Einstein statistics expression lies safely in the unphysical
region of negative $\epsilon$.


Figure 11.1 Bose-Einstein occupation number of states with (positive) energy $\epsilon$, illustrated for
two (negative) values of reservoir chemical potential $\mu$ and a fixed temperature. The dashed lines
are continuations of $\langle N\rangle_\epsilon$ for unphysical negative values of $\epsilon$, to illustrate the singularity at $\epsilon = \mu$.


A further feature of interest is that as the temperature of the reservoir is increased, at 
constant chemical potential, the mean population in the state again increases; explicitly 


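$$\left(\frac{\partial\langle N\rangle_\epsilon}{\partial T}\right)_{\mu,V} = \frac{(\epsilon-\mu)}{kT^2}\frac{\exp\left(\frac{\epsilon-\mu}{kT}\right)}{\left(\exp\left(\frac{\epsilon-\mu}{kT}\right)-1\right)^2}. \tag{11.5}$$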


The picture we have, therefore, is that the mean populations in a set of single particle
states in a box can be increased or decreased by varying the external reservoir parameters
$\mu$ and $T$. The sum of the occupation numbers of all the single particle states, divided by
the volume of the box, then corresponds to the mean particle density $n = \langle N\rangle/V$,
and we can presume that the chemical potential of such a gas will match that of the
reservoir since the two will be in equilibrium.

It is quite illuminating to insert the expression in (9.25) for the reservoir chemical
potential appropriate to the classical gas regime (with spin zero particles), namely
$\mu = kT\ln(n/n_q)$ with $n \ll n_q$, into (11.3). For $\mu/kT \ll 0 < \epsilon/kT$ such that
$\exp[(\epsilon-\mu)/kT] \gg 1$ we get

$$\langle N\rangle_\epsilon \approx \exp\left(\frac{\mu-\epsilon}{kT}\right) = \frac{n}{n_q}\exp\left(-\frac{\epsilon}{kT}\right) \ll 1, \tag{11.6}$$

demonstrating that the likelihood of multiple occupancy of any of the single particle
states is very small if the density is much less than the quantum concentration $n_q$.
Quantum gas thermodynamic properties emerge when the reservoir parameters are such
that the mean density of particles in the system exceeds $n_q$. Equation (9.25) and Figure
11.1 suggest that this would involve the approach of $\mu$ towards zero from below, such
that the single particle states with lowest energy become multiply occupied. But the
relationship between $\mu$ and $n$ will then no longer be given by the classical expression,
and the gas will deviate from classical behaviour, which we explore next.
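
The dilute limit described above can be made concrete with a small sketch. It compares the Bose-Einstein occupation with the classical form $(n/n_q)\exp(-\epsilon/kT)$ for an assumed density ratio $n/n_q = 0.01$; the function names and sample energies are illustrative choices, not taken from the text.

```python
import math

def bose_einstein(eps_over_kT, mu_over_kT):
    """Occupation number (11.3) in dimensionless form."""
    return 1.0 / math.expm1(eps_over_kT - mu_over_kT)

def classical_occupation(eps_over_kT, n_over_nq):
    """Dilute-limit occupation (n / n_q) exp(-eps / kT)."""
    return n_over_nq * math.exp(-eps_over_kT)

n_over_nq = 0.01                      # assumed dilute gas, n << n_q
mu_over_kT = math.log(n_over_nq)      # classical chemical potential (9.25)
for eps_over_kT in (0.1, 1.0, 3.0):
    exact = bose_einstein(eps_over_kT, mu_over_kT)
    approx = classical_occupation(eps_over_kT, n_over_nq)
    print(eps_over_kT, exact, approx)
```

Both columns stay far below unity and differ only by a fraction of a percent at this density, in line with the claim that multiple occupancy is negligible when $n \ll n_q$.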










11.3 Thermal Properties of a Boson Gas 


The picture that we have established, of single particle states forming a cubic array 
in k-space, and in equilibrium with an external reservoir according to Bose-Einstein 
statistics, is remarkably similar to the filling of crystalline lattice sites with vacancies 
drawn from an external vacancy reservoir, discussed in Section 7.3, though there is an 
important change in sign in the denominator of the expression for the mean population at 
each site when comparing (11.3) with (7.18). We can pursue this analogy and construct 
the grand partition function for the gas as a product of grand partition functions for each 
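of the single particle states labelled by $k_i$:

$$Z_G(\mu, V, T) = \sum_{N_{k_1}=0}^{\infty}\exp\left(-\frac{N_{k_1}(\epsilon_{k_1}-\mu)}{kT}\right)\sum_{N_{k_2}=0}^{\infty}\exp\left(-\frac{N_{k_2}(\epsilon_{k_2}-\mu)}{kT}\right)\cdots. \tag{11.7}$$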

The microstates are labelled by the set of populations $\{N_{k_i}\}$ across the array. Notice that
this avoids assigning labels to any of the particles, so we can be sure that the
result will be appropriate for indistinguishable particles.
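
The factorisation into single-state grand partition functions can be checked by brute force for a toy system. This is a sketch under stated assumptions: three hypothetical single particle energies, units where $kT = 1$, and a cap `n_max` on each population so that the enumeration is finite.

```python
import itertools
import math

def grand_partition_enumerated(energies, mu, kT, n_max):
    """Sum exp(-(E - mu N)/kT) over every set of populations up to n_max per state."""
    total = 0.0
    for populations in itertools.product(range(n_max + 1), repeat=len(energies)):
        exponent = sum(n * (mu - e) for n, e in zip(populations, energies)) / kT
        total += math.exp(exponent)
    return total

def grand_partition_product(energies, mu, kT):
    """Product of the closed-form single-state grand partition functions (11.2)."""
    return math.prod(1.0 / (1.0 - math.exp((mu - e) / kT)) for e in energies)

# Three hypothetical single particle energies, in units where kT = 1.
energies = [0.5, 1.0, 1.5]
enumerated = grand_partition_enumerated(energies, mu=0.0, kT=1.0, n_max=40)
product = grand_partition_product(energies, mu=0.0, kT=1.0)
print(enumerated, product)
```

The enumeration runs over sets of populations only, never over labelled particles, which mirrors the point made above about indistinguishability.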

The grand partition function of the gas is therefore

$$Z_G = \prod_{k_i} Z_G^{\epsilon_{k_i}}, \tag{11.8}$$

with $Z_G^{\epsilon_{k_i}}$ given by $[1-\exp[(\mu-\epsilon_{k_i})/kT]]^{-1}$ according to (11.2). This is equivalent
to $\ln Z_G = \sum_{k_i}\ln Z_G^{\epsilon_{k_i}}$, which can be written in the form $\ln Z_G = \sum_\epsilon\Omega(\epsilon)\ln Z_G^\epsilon$ where
the sum is over the energies of single particle states and $\Omega(\epsilon)$ is the multiplicity of
such states at a given energy $\epsilon$. As the single particle energies lie very close together,
we might consider representing the sum as an integral over a continuous single particle
energy $\epsilon$, denoting the multiplicity of single particle states in the range $\epsilon \to \epsilon + d\epsilon$ as
$g(\epsilon)\,d\epsilon$, where $g(\epsilon)$ is a density of states as discussed in Section 9.2. We write


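$$\ln Z_G \approx \int_0^\infty g(\epsilon)\ln Z_G^\epsilon\,d\epsilon = -\int_0^\infty g(\epsilon)\ln\left[1-\exp\left(\frac{\mu-\epsilon}{kT}\right)\right]d\epsilon. \tag{11.9}$$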


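The average number of particles in the $g(\epsilon)\,d\epsilon$ single particle states within the energy
range $\epsilon \to \epsilon + d\epsilon$ is equal to $\langle N\rangle_\epsilon\,g(\epsilon)\,d\epsilon$, with $\langle N\rangle_\epsilon = (\exp[(\epsilon-\mu)/kT]-1)^{-1}$
according to (11.3). Thus the mean number of particles in the system, across the entire
set of single particle energy states, is

$$\langle N\rangle = \sum_{k_i}\langle N\rangle_{\epsilon_{k_i}} = \sum_\epsilon\Omega(\epsilon)\langle N\rangle_\epsilon \approx \int_0^\infty g(\epsilon)\langle N\rangle_\epsilon\,d\epsilon. \tag{11.10}$$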

This can be obtained more formally from (11.9) using 


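$$\langle N\rangle = kT\left(\frac{\partial\ln Z_G}{\partial\mu}\right)_{T,V} = kT\int_0^\infty g(\epsilon)\left(\frac{\partial\ln Z_G^\epsilon}{\partial\mu}\right)_{T,V}d\epsilon = \int_0^\infty g(\epsilon)\langle N\rangle_\epsilon\,d\epsilon. \tag{11.11}$$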
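Similarly, from the definition of $Z_G$ in (11.1), we can write

$$\langle E-\mu N\rangle = kT^2\left(\frac{\partial\ln Z_G}{\partial T}\right)_{\mu,V} = kT^2\int_0^\infty g(\epsilon)\left(\frac{\partial\ln Z_G^\epsilon}{\partial T}\right)_{\mu,V}d\epsilon = \int_0^\infty g(\epsilon)(\epsilon-\mu)\langle N\rangle_\epsilon\,d\epsilon, \tag{11.12}$$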


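where we have used

$$kT^2\left(\frac{\partial\ln Z_G^\epsilon}{\partial T}\right)_{\mu,V} = kT^2\,\frac{\exp\left(-\frac{\epsilon-\mu}{kT}\right)}{1-\exp\left(-\frac{\epsilon-\mu}{kT}\right)}\,\frac{(\epsilon-\mu)}{kT^2} = \frac{\epsilon-\mu}{\exp\left(\frac{\epsilon-\mu}{kT}\right)-1} = (\epsilon-\mu)\langle N\rangle_\epsilon. \tag{11.13}$$

Comparing (11.11) and (11.12), this is consistent with our requirement that

$$\langle E\rangle \approx \int_0^\infty g(\epsilon)\,\epsilon\,\langle N\rangle_\epsilon\,d\epsilon, \tag{11.14}$$

on the basis of the intuition that the total mean energy is the sum of mean energies
$\epsilon\langle N\rangle_\epsilon$ for each single particle state.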

Inserting (9.11) and (11.3) into (11.10) we then obtain 


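$$n(\mu, T) = \frac{\langle N\rangle}{V} = \frac{(2s+1)}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty\frac{\epsilon^{1/2}\,d\epsilon}{\exp\left(\frac{\epsilon-\mu}{kT}\right)-1}, \tag{11.15}$$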


for the mean density of the gas at a given chemical potential and temperature. If this could
be inverted, it would give us the chemical potential $\mu(n, T)$ of the gas in terms of the
mean number of particles in the system at a given temperature, namely a replacement for
the classical expression $\mu = kT\ln[n/n_q(T)]$ appropriate for spin zero particles. However,
this is not straightforward. Nevertheless the classical limit may be recovered under the
appropriate assumption $\epsilon - \mu \gg kT$, or very negative $\mu$, and $s = 0$, such that

$$\begin{aligned}
n &\approx \frac{1}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty\epsilon^{1/2}\exp\left(-\frac{\epsilon-\mu}{kT}\right)d\epsilon
= \frac{1}{(2\pi)^2}\left(\frac{2mkT}{\hbar^2}\right)^{3/2}\exp\left(\frac{\mu}{kT}\right)2\int_0^\infty z^2\exp(-z^2)\,dz \\
&= \frac{1}{(2\pi)^2}\left(\frac{2mkT}{\hbar^2}\right)^{3/2}\frac{\pi^{1/2}}{2}\exp\left(\frac{\mu}{kT}\right) = n_q\exp\left(\frac{\mu}{kT}\right),
\end{aligned} \tag{11.16}$$

having used the substitution $z = (\epsilon/kT)^{1/2}$, such that $\mu \approx kT\ln(n/n_q)$ as before.
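
Although (11.15) cannot be inverted in closed form, it is easy to evaluate numerically. The sketch below does so in the dimensionless form $n/n_q = (2s+1)(2/\pi^{1/2})\int_0^\infty x^{1/2}\,dx/(\exp(x-\alpha)-1)$ with $\alpha = \mu/kT$, and compares the result with the classical value $\exp(\alpha)$. The Simpson integrator, the cut-off at $x = 64$ and the function names are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def boson_density_over_nq(alpha, spin=0):
    """n / n_q from (11.15) for alpha = mu / kT < 0, using x = eps / kT = t**2."""
    def integrand(t):
        return 2.0 * t * t / math.expm1(t * t - alpha)
    integral = simpson(integrand, 0.0, 8.0)   # exp(-64) makes the tail negligible
    return (2 * spin + 1) * 2.0 / math.sqrt(math.pi) * integral

for alpha in (-8.0, -5.0, -2.0, -0.5):
    print(alpha, boson_density_over_nq(alpha), math.exp(alpha))
```

For very negative $\alpha$ the two columns coincide; as $\alpha$ rises towards zero the boson density pulls ahead of the classical estimate.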
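The mean energy of the gas per unit volume, for a given $\mu$ and $T$, is written

$$e(\mu, T) = \frac{\langle E\rangle}{V} = \frac{(2s+1)}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty\frac{\epsilon^{3/2}\,d\epsilon}{\exp\left(\frac{\epsilon-\mu}{kT}\right)-1}, \tag{11.17}$$

and in the classical limit with $s = 0$ this specific energy becomes

$$\begin{aligned}
e &\approx \frac{1}{(2\pi)^2}\left(\frac{2mkT}{\hbar^2}\right)^{3/2}kT\exp\left(\frac{\mu}{kT}\right)2\int_0^\infty z^4\exp(-z^2)\,dz \\
&= \frac{1}{(2\pi)^2}\left(\frac{2mkT}{\hbar^2}\right)^{3/2}\frac{3\pi^{1/2}}{4}\,kT\exp\left(\frac{\mu}{kT}\right) = \frac{3}{2}nkT,
\end{aligned} \tag{11.18}$$

just as we would expect for an ideal gas, having inserted $\mu = kT\ln(n/n_q)$.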
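
The approach to $(3/2)kT$ per particle can be seen by taking the ratio of the integrals in (11.17) and (11.15). The following sketch is illustrative only: it works with $\alpha = \mu/kT$, uses a simple Simpson rule with an assumed cut-off, and the helper names are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def boson_integral(power, alpha):
    """Integral of x**power / (exp(x - alpha) - 1) over x >= 0, for alpha < 0."""
    def integrand(t):                      # substitution x = t**2
        return 2.0 * t ** (2 * power + 1) / math.expm1(t * t - alpha)
    return simpson(integrand, 0.0, 8.0)

def energy_per_particle_over_kT(alpha):
    """e / (n kT): ratio of the integrals in (11.17) and (11.15)."""
    return boson_integral(1.5, alpha) / boson_integral(0.5, alpha)

for alpha in (-8.0, -5.0, -1.0, -0.01):
    print(alpha, energy_per_particle_over_kT(alpha))
```

The ratio sits at 3/2 for very negative $\alpha$ and drops below it as $\alpha$ approaches zero, where the classical expression no longer applies.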

The Gibbs entropy of the gas is obtained from the grand partition function through 
the relation (8.17): we write 


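$$TS_G = \langle E\rangle - \mu\langle N\rangle + kT\ln Z_G, \tag{11.19}$$

such that according to (11.12) and (11.9) we get

$$\begin{aligned}
S_G &= \frac{1}{T}\int_0^\infty g(\epsilon)(\epsilon-\mu)\langle N\rangle_\epsilon\,d\epsilon + k\int_0^\infty g(\epsilon)\ln Z_G^\epsilon\,d\epsilon \\
&= \frac{1}{T}\int_0^\infty g(\epsilon)\left[\frac{\epsilon-\mu}{\exp\left(\frac{\epsilon-\mu}{kT}\right)-1} - kT\ln\left(1-\exp\left(\frac{\mu-\epsilon}{kT}\right)\right)\right]d\epsilon,
\end{aligned} \tag{11.20}$$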


which can be shown to reduce to (2.54) in the classical limit where $\epsilon - \mu \gg kT$, namely
an entropy per unit volume $s = S_G/V$ (not to be confused with spin, nor with the specific
entropy $S/N$ introduced in Section 3.9.2) that goes to $s_{cl} = nk(5/2 - \mu/kT)$, or

$$s_{cl} = nk\ln\left[\frac{\mathrm{e}^{5/2}n_q(T)}{n}\right] = \frac{3}{2}nk\ln\left[\frac{\mathrm{e}^{5/3}\,T}{T_q(n)}\right], \tag{11.21}$$

returning it to the form derived in (2.22) or the Sackur-Tetrode version (9.24), with spin
zero particles, and having inserted

$$T_q(n) = \frac{h^2n^{2/3}}{2\pi mk}. \tag{11.22}$$


Finally, we consider the pressure of the gas. This is straightforward to obtain from 
the grand partition function when we write 


$$p = kT\left(\frac{\partial\ln Z_G}{\partial V}\right)_{T,\mu}, \tag{11.23}$$
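which is an analogue of (8.22) for the grand canonical instead of the canonical ensemble.
From (11.9) the only $V$-dependence in $\ln Z_G$ arises from the factor of $V$ in the density
of single particle states $g(\epsilon)$ defined in (9.11), so

$$p = \frac{kT}{V}\ln Z_G = -kT\,\frac{(2s+1)}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty\epsilon^{1/2}\ln\left[1-\exp\left(\frac{\mu-\epsilon}{kT}\right)\right]d\epsilon, \tag{11.24}$$

which has the classical limit $p \to nkT$.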

These expressions for the principal thermodynamic properties of the gas are slightly
forbidding, but they yield understanding when we choose a chemical potential and temperature
and evaluate the associated particle, energy and entropy densities $n$, $e$, $s$ and the
pressure $p$. An example is given in Figure 11.2, where the temperature is fixed and $\mu$
is varied. Classical behaviour is recovered when $\epsilon - \mu \gg kT$ such that $e \to (3/2)nkT$
and $p \to nkT$ while the entropy density tends towards the form $s_{cl} = nk(5/2 - \mu/kT)$.
All these quantities start to deviate from classical behaviour when the particle density
approaches and exceeds the quantum concentration.

Nonclassical behaviour can also be exposed by eliminating $\mu$ between $e(\mu, V, T)$ and
$n(\mu, V, T)$, for example, to give the energy density $e(n, V, T)$ in terms of more familiar
variables. As an illustration, we plot $e$ and $s$ against $T$ at constant $n$ in Figure 11.3 for spin
zero particles, and normalise the results by the classical expressions $e_{cl} = (3/2)nkT$ and
$s_{cl} = nk\ln(\mathrm{e}^{5/2}n_q/n)$. To a certain extent, the behaviour looks familiar. The classical
mean energy is not appropriate for temperatures below $T_q = h^2n^{2/3}/(2\pi mk)$. This is
the threshold temperature for quantum effects that was discussed in connection with
Figure 9.7. The entropy is also reduced, which brings to mind the freezing out of degrees
of freedom in the low temperature thermal behaviour of a diatomic gas, or of the Einstein
solid, though the reasons for the deviation here are rather different.

However, these expressions harbour a very serious problem, and simply cannot be
entirely correct. This is partly illustrated by the fact that the energy and entropy are not
reported below a temperature $T \approx 0.5T_q$ in Figure 11.3. We investigate this problem in
the next section.


11.4 Bose-Einstein Condensation 


The nonclassical modifications to the thermal properties of a boson gas that we saw in 
the previous section as the temperature is reduced or chemical potential increased are 
perhaps not dramatic enough to explain the strange low temperature phenomena that 
were discussed in Section 10.3. But there is a flaw hidden within (11.15), in particular, 
and by investigating it we shall discover a rationale for such behaviour. 

The density of particles in the gas derived in (11.15) may be rewritten as 


$$n = \frac{2(2s+1)}{\pi^{1/2}}\,n_q\int_0^\infty\frac{x^{1/2}\,dx}{\exp\left(-\frac{\mu}{kT}\right)\exp(x)-1}, \tag{11.25}$$
showing that the density increases as the (negative) chemical potential increases; explicitly

$$\left(\frac{\partial n}{\partial\mu}\right)_{T,V} = \frac{2(2s+1)n_q}{kT\pi^{1/2}}\int_0^\infty\frac{x^{1/2}\exp\left(-\frac{\mu}{kT}\right)\exp(x)\,dx}{\left(\exp\left(-\frac{\mu}{kT}\right)\exp(x)-1\right)^2}. \tag{11.26}$$

The problem is that we concluded in Section 11.2 that the chemical potential has an
upper limit of (approximately) zero, such that the integral in (11.25) has an upper limit
of $\int_0^\infty x^{1/2}(\exp(x)-1)^{-1}\,dx \approx 1.306\pi^{1/2}$. For lower (i.e. negative) $\mu$, the denominator
in the integrand is larger than $(\exp(x)-1)$ for all $x > 0$ and the integral is therefore
smaller. This suggests that the mean particle density of the system must satisfy

$$n \le n_{max}(T) = 2.612(2s+1)n_q(T), \tag{11.27}$$

and this upper limit is apparent in the plot of $n/n_q$ for spin zero particles in Figure 11.2.
Does this mean that there is a temperature-dependent maximum number of particles that
can be accommodated in the box? What happens if we try to pump in some more? Or
more tellingly, what happens if we have a certain density of particles $n$, then shut off
their exchange with the reservoir and reduce the temperature until $n_{max}(T)$ falls below
$n$, or equivalently when $T < 0.5T_q$ in Figure 11.3? The particles cannot disappear!
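
The numbers 1.306 and 2.612 quoted above can be reproduced with a short numerical check. This is a sketch under assumptions: a hand-written Simpson rule, a cut-off at $x = 64$, and an independent estimate of the series $\sum_k k^{-3/2}$ (with a simple integral tail correction) used for comparison.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def bose_integrand(t):
    """x**0.5 / (exp(x) - 1) dx rewritten with x = t**2; the t -> 0 limit is 2."""
    if t == 0.0:
        return 2.0
    return 2.0 * t * t / math.expm1(t * t)

integral = simpson(bose_integrand, 0.0, 8.0)
n_max_over_nq = 2.0 / math.sqrt(math.pi) * integral      # spin zero

# Independent estimate: partial sum of k**-1.5 plus the integral tail 2 / sqrt(K).
K = 100000
series_estimate = sum(k ** -1.5 for k in range(1, K + 1)) + 2.0 / math.sqrt(K)

print(integral / math.sqrt(math.pi), n_max_over_nq, series_estimate)
```

The substitution $x = t^2$ removes the square-root behaviour at the origin, so an ordinary quadrature rule is adequate even though $\mu = 0$ places the singularity of the occupation number at the edge of the integration range.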

Fortunately, the flaw in the development is easily identified. The approximation (11.10)
is the root of the problem. An integral representation of the sum of mean populations
over all the single particle states is an inadequate treatment for the very lowest states.
The square root density of states (9.11) is a poor approximation at low $\epsilon$. Recall that it
was derived in Section 9.2 by counting standing wave states in a thin shell in k-space,
on the assumption that there are many such states to count and the granularity of the
lattice of allowed states did not matter. But as the gas approaches the quantum regime,
the mean populations of particles in the low energy states become significant and an
approximate treatment will not do.

Figure 11.2 Properties of a gas of spin zero bosons as a function of the chemical potential
at constant temperature. As $\mu$ approaches zero from below, the particle density increases, and
it exceeds the quantum concentration for $\mu > -0.4kT$. The energy per particle then begins to
deviate from the classical value $(3/2)kT$, the pressure falls below $nkT$ and the entropy per particle
deviates from the classical result $k(5/2 - \mu/kT)$.

Figure 11.3 Energy and entropy of a gas of spin zero bosons, per unit volume, divided by
the classical expressions, as a function of temperature, illustrating quantum gas behaviour when
$T \lesssim T_q$.

In order to remove this deficiency, we recognise that the integral of populations over
the model density of states accounts for only some of the particles. The populations in
the lowest energy states should be represented explicitly. In the simplest treatment, we
consider just the population in the lowest energy state $\epsilon_0 = 3h^2/(8mV^{2/3}) \approx 0$. This solves
the problem of the apparent disappearance of particles when $T$ falls below $0.5T_q$: the
particles that seemed not to find a home are actually accommodated in this state. The
mean population in the single particle ground state at $\epsilon_0$ is therefore given by

$$\langle N\rangle_0 = \langle N\rangle - Vn_{max}(T), \tag{11.28}$$

for temperatures where $n_{max}(T) < n$, a condition that corresponds to

$$T < \left(\frac{n}{2.612(2s+1)}\right)^{2/3}\frac{h^2}{2\pi mk} = \left(\frac{1}{2.612(2s+1)}\right)^{2/3}T_q(n), \tag{11.29}$$

defining a temperature $T_c$, approximately equal to $0.5T_q$ for a spin of zero. Since $n_{max} \propto
n_q \propto T^{3/2}$, the relative proportion of particles occupying the ground state is given by

$$\frac{\langle N\rangle_0}{\langle N\rangle} = 1 - \frac{n_{max}(T)}{n} = 1 - \left(\frac{T}{T_c}\right)^{3/2}, \tag{11.30}$$

for $T < T_c$. The ground state appears to accumulate enormous numbers of particles when
the temperature is decreased below $T_c$, and as $T \to 0$, all the particles fall into that state.
The fractional occupation of the ground state as a function of temperature is illustrated
in Figure 11.4, together with the proportion of particles that are not in the ground state,
written $\langle N\rangle_{\ne 0}/\langle N\rangle$. This is multiple occupancy taken to an extraordinary degree.
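
The condensation temperature (11.29) and the ground state fraction (11.30) are simple to evaluate. The sketch below uses the quantum temperature of (11.22); the sample inputs (a particle mass of about 6.65e-27 kg and a number density of about 2.2e28 m^-3, roughly those of liquid helium-4) are assumptions chosen for illustration, and the ideal gas formula ignores the strong interactions present in a real liquid.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K

def quantum_temperature(n, m):
    """T_q(n) = h^2 n^(2/3) / (2 pi m k), as in (11.22)."""
    return H ** 2 * n ** (2.0 / 3.0) / (2.0 * math.pi * m * K_B)

def condensation_temperature(n, m, spin=0):
    """T_c from (11.29)."""
    return quantum_temperature(n, m) / (2.612 * (2 * spin + 1)) ** (2.0 / 3.0)

def ground_state_fraction(T, T_c):
    """<N>_0 / <N> from (11.30); zero at and above T_c."""
    return 0.0 if T >= T_c else 1.0 - (T / T_c) ** 1.5

# Assumed illustrative inputs (approximately liquid helium-4).
T_c = condensation_temperature(n=2.18e28, m=6.646e-27)
print("T_c / K:", T_c)
for fraction_of_Tc in (1.0, 0.75, 0.5, 0.25, 0.0):
    print(fraction_of_Tc, ground_state_fraction(fraction_of_Tc * T_c, T_c))
```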

The model now makes perfect sense. At a temperature of absolute zero, it satisfies our
expectation that the particles should take the lowest available energy level. This would
also be consistent with the third law: because there is just a single microstate available
in this limit, the entropy of the gas will be zero. However, what is perhaps not expected
is that there is a distinct temperature below which a gross ‘condensation’ of particles
into the lowest energy single particle state begins. The word condensation suggests a
phase transition of some kind, and indeed it is rather like that: the gas switches rapidly
from a phase with properties that are a distorted version of those of a classical gas, to a
phase that is altogether different. The phenomenon of the mass occupation of the lowest
energy single particle state of the system is known as Bose-Einstein condensation and
$T_c$ is called the Bose-Einstein condensation temperature.

Figure 11.4 The mean number of particles of a boson gas occupying the ground state $\langle N\rangle_0$
and occupying all other states $\langle N\rangle_{\ne 0}$, as a proportion of the whole population as a function of
temperature. $T_c$ is the Bose-Einstein condensation temperature.

We can illustrate the change in behaviour by sketching the entropy of the gas as a
function of temperature at constant density in Figure 11.5. The entropy takes its classical
logarithmic dependence on $T$ for $T \gg T_c$, but this cannot remain valid as the temperature
is reduced since it would imply that the entropy eventually became negative. Instead,
from Figure 11.2 we see that the entropy falls to approximately $1.3k$ per particle at
$\mu \approx 0$, or equivalently at $T = T_c$, and then the entropy continues towards zero as the
temperature is reduced below this threshold in proportion to the number of particles
$\langle N\rangle_{\ne 0}$ that do not lie in the single particle ground state. The population in the ground
state $\langle N\rangle_0$ contributes zero entropy. It is as though the gas separates into two components
with very different thermodynamic properties.
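
The figure of roughly $1.3k$ per particle can be checked by evaluating the entropy integral (11.20) and the density integral (11.15) at $\mu = 0$ and taking their ratio, since the common prefactor cancels. This is only a sketch: the Simpson rule, the cut-off and the function names are assumptions, and the ground state is taken to contribute neither particles nor entropy at $T = T_c$.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def density_integrand(t):
    """x**0.5 / (exp(x) - 1) dx with x = t**2 (the t -> 0 limit is 2)."""
    return 2.0 if t == 0.0 else 2.0 * t * t / math.expm1(t * t)

def entropy_integrand(t):
    """Bracket of (11.20) at mu = 0, in units of k, times x**0.5 dx with x = t**2."""
    if t == 0.0:
        return 0.0                      # t**2 * log(t**2) tends to zero
    x = t * t
    return 2.0 * x * (x / math.expm1(x) - math.log(-math.expm1(-x)))

entropy_per_particle = (simpson(entropy_integrand, 0.0, 8.0)
                        / simpson(density_integrand, 0.0, 8.0))
print(entropy_per_particle)             # in units of k
```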

Figure 11.6 illustrates how the chemical potential is logarithmic in temperature in
the classical regime $T \gg T_c$, taking the form $\mu \approx -(3/2)kT\ln(T/T_q)$, but approaches
zero asymptotically from below for $T < T_c$. Further implications are that the energy of
the gas is proportional to $T$ in the classical regime, but is considerably suppressed for
temperatures below $T_c$ as a result of particle condensation into the $\epsilon \approx 0$ ground state.
The gas pressure is also suppressed.

So what can we say about a gas in which a considerable fraction of the particles reside
in the lowest standing wave quantum mechanical state? The first thing we must abandon
is the classical picture of particles moving around in a chaotic fashion and colliding
with one another. The picture provided by quantum mechanics is one where particles
are moving (slowly!) backwards and forwards between the walls and never seeming to
interact with each other. It is hard to get a feel for such a gas, but there ought to be at
least two clear signatures: firstly that the distribution of particle speed is very narrow, as
most of the particles are occupying the same energy level, and secondly that the viscosity
of such a fluid ought to be very low, as the particles rarely transfer momentum to each
other. Resistance to the shearing of a fluid fundamentally arises from collision processes
on a microscopic scale that spread out the effects of an external impulse.

Figure 11.5 Entropy per particle $s/n$ for a gas of spin zero bosons as a function of temperature
at constant density, illustrating its deviation from the classical Sackur-Tetrode expression as the
temperature falls below the Bose-Einstein condensation temperature $T_c$.

Figure 11.6 Temperature dependence of the chemical potential of a gas of spin zero bosons, at
a fixed density, showing the change in behaviour at the Bose-Einstein condensation temperature
$T_c \approx 0.5T_q$.
It is a triumph of the model that these are two of the features of superfluids, such
as liquid $^4$He and trapped atomic gases at densities above the quantum concentration,
that we discussed in Section 10.3. At low temperatures, boson gases adopt a new
state of matter with a macroscopic degree of correlation in the quantum states of the
component particles.

But what about the phenomenon of superconductivity? A lack of collisions between 
current-carrying particles in a material sounds like a perfect scenario for explaining zero 
resistivity, and it is made even more appealing by observing that it sets in rather abruptly 
below a threshold temperature, as is found with superfluidity. There is just one problem. 
The model we have developed applies to bosons, and the current-carrying particles in a 
conductor are supposedly electrons. And they are spin half fermions, aren’t they? 


11.5 Cooper Pairs and Superconductivity 

When might a gas of fermions behave like a gas of bosons? Constructing an argument 
to allow this to happen might allow us to regard superconductivity as a Bose-Einstein 
condensation phenomenon. It has been shown that electrons in a material feel a weak 
attractive force towards one another, caused by their distortion of the arrangement of the
surrounding atoms. In classical physics terms, a mobile electron pulls surrounding ions 
of the material towards it, creating a slight excess of positive charge in its neighbourhood 
compared with the situation in the absence of the electron. The distortion produces an 
attractive electrostatic well for a second mobile electron, and in spite of their mutual 
repulsion, the two electrons tend to associate, if rather loosely, into a state known as a 
Cooper pair. The interaction is nicely pictured in quantum mechanics as the emission of 
a quantum of lattice vibrational energy, known as a phonon, by one electron, followed 
by its absorption by the second electron. 

The attraction provides a rationale for supposing that some electrons pair up as composite
particles that can be regarded as bosons, since the addition of two half integer
spins would make an integer spin. This is analogous to the formation of a helium atom, a
boson, from electrons, protons and neutrons, all fermions. So part of the electron gas (in
the presence of a suitable background medium) can convert itself into a gas of bosons at
sufficiently low temperature. Then the condensation of the bosons into the lowest energy
single particle state can proceed, giving rise to a quantum gas of Cooper pairs, each of
which can carry an electric current with little or no collisional resistance, and we have a
scheme for explaining superconductivity. Furthermore, the ease with which electric currents
flow in superconductors means that exposure to a magnetic field elicits an extreme
response that has the effect of cancelling the magnetic field within the material. The
effective repulsion of magnetic flux is the Meissner effect.

The details of this so-called BCS mechanism of superconductivity, named after John
Bardeen (1908-1981), Leon Cooper (1930-) and John Schrieffer (1931-), are much
more elaborate than suggested by the above sketch, and understanding the mechanism
relies to a considerable extent on an appreciation of the properties of the underlying
fermion gas, which we discuss in Chapter 12. The main message to absorb here is that
bosons can be composites of fermions (but not vice versa) and that a theory based on
this idea does appear to explain the behaviour of at least some types of superconductor.
In short, it is another triumph of statistical physics!


Exercises 

11.1 Four bosons are held in a harmonic potential that has single particle states of
energy $\epsilon = n\hbar\omega$, where $n$ = 0, 1, 2 and so on. (a) Indicate the pattern of particle
occupation of states, and total energy, of the first five microstates of the system
(ordered by energy) if the bosons are distinguishable. (b) Indicate the particle
occupation and total energy of the first seven microstates of the system if the
bosons are indistinguishable.

11.2 Derive the classical limits of the entropy and pressure of a boson gas starting with 
the expressions given in (11.20) and (11.24). 

11.3 Sketch the energy per particle of a boson gas as a function of temperature. 

11.4 Considering the temperature dependence of the entropy of a boson gas shown in 
Figure 11.5, or that of the energy per particle considered in question 11.3, sketch 
the heat capacity of the gas as a function of temperature. What feature appears in 
the vicinity of $T = T_c$?

11.5 At what temperature would you expect a trapped gas of rubidium atoms with 
density $10^{19}$ m$^{-3}$ to show signs of Bose-Einstein condensation?

11.6 If the mass of a Cooper pair is twice that of an electron, estimate their density in 
a material for which superconductivity sets in at 10 K. 
12 Fermion Gas


Half integer spin particles are known as fermions in honour of Enrico Fermi 
(1901-1954). At low temperature or high density, we might naturally expect the 
properties of a gas of noninteracting fermions to differ from those of a classical gas, 
and we explore these quantum effects in this chapter. 

We have made the point already that the classical gas model is founded on the assumption
that the vast majority of contributions to the partition function arise from microstates
where there is no multiple occupancy of the same single particle states. But fermions
satisfy the Pauli Exclusion Principle, and we might wonder why the classical model
developed in Section 9.5 does not apply to a fermion gas in the quantum regime. The
problem is that the partition function for distinguishable particles $Z^{dist} = Z_1^N$, upon which
the classical model of a gas is based, where $Z_1$ is the partition function of a single particle
in the box, implicitly contains terms where the particles occupy the same single
particle state; so it will not do. We need to develop a more appropriate model, along
lines similar to those developed for bosons in Chapter 11.


12.1 Grand Partition Function for Fermions in a Single Particle State


Once again, we construct a grand canonical ensemble for a system consisting of a single
particle state at a specified energy. The grand partition function is in general

$$Z_G(\mu, V, T) = \sum_{\text{microstates } i}\exp\left(-\frac{E_i-\mu N_i}{kT}\right), \tag{12.1}$$

where $\mu$ and $T$ are parameters of the reservoir. For a single particle state at energy $\epsilon$,
the microstate energy is given by $E_i = N_i\epsilon$ as for bosons. The occupancy of the state
$N_i$, however, is now strictly zero or unity, according to the Pauli Exclusion Principle,
and so

$$Z_G^\epsilon = \sum_{N_i=0}^{1}\exp\left[\frac{(\mu-\epsilon)N_i}{kT}\right] = 1+\exp\left(\frac{\mu-\epsilon}{kT}\right). \tag{12.2}$$

This allows us to calculate the statistical properties of the single particle state, and by
extension, the entire fermion gas.


12.2 Fermi-Dirac Statistics 


We proceed as we did for bosons in Section 11.2. We calculate the mean population, 
or occupation number, for the single particle state as a function of reservoir chemical 
potential and temperature: 


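$$\langle N\rangle_\epsilon = kT\left(\frac{\partial\ln Z_G^\epsilon}{\partial\mu}\right)_{T,V} = \frac{\exp\left(\frac{\mu-\epsilon}{kT}\right)}{1+\exp\left(\frac{\mu-\epsilon}{kT}\right)} = \frac{1}{\exp\left(\frac{\epsilon-\mu}{kT}\right)+1}. \tag{12.3}$$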


This result is known as Fermi-Dirac statistics or the Fermi-Dirac function, in recognition
of the seminal contributions of Paul Dirac (1902-1984) to the theory of fermions.
The difference in sign in the denominator with respect to the Bose-Einstein version in
(11.3) is extremely important.

The occupation number as a function of state energy is illustrated in Figure 12.1. On
the left, we also see the variation in mean population as $\mu$ is increased. The situation
for $\mu = -kT$ is not too dissimilar to the Bose-Einstein occupation number for the same
conditions in Figure 11.1. However, there is now no reason why $\mu$ cannot exceed zero,
and for cases $\mu = 5kT$ and $10kT$, we see the progressive filling of the single particle
states up to the capacity of unity set by the Pauli Exclusion Principle. The single particle
state that has a mean population of 1/2 has an energy equal to the imposed chemical
potential.




Figure 12.1 Fermi-Dirac occupation numbers for single particle states at energies $\epsilon$ in contact
with a reservoir at chemical potential $\mu$ and temperature $T$. The variation with $\mu$ at constant $T$ is
shown on the left, and the variation with $T$ at constant $\mu$ on the right. A mean occupation number
of 1/2 prevails at an energy equal to the reservoir chemical potential. The occupation number varies
between zero and unity over a range of a few $kT$ about $\epsilon = \mu$.

On the right in Figure 12.1, we explore the dependence of $\langle N\rangle_\epsilon$ on temperature at
a constant chemical potential of $\mu = 10kT_q$, where $T_q$ is given in (11.22). The mean
populations at $T = 5T_q$, $T_q$ and $(1/5)T_q$ illustrate that as the temperature is reduced,
the state energy range over which the variation from unity to zero takes place becomes
narrower. In the limit $T \to 0$, the occupation number takes the form of a step function,
equal to unity for $\epsilon < \mu$ and zero for $\epsilon > \mu$. This behaviour is analogous to the manner in
which electrons occupy atomic energy levels up to a certain point, leaving higher energy
levels empty. The electrons are prevented from congregating in the lowest energy atomic
state by the Pauli Exclusion Principle, and the same rule applies for fermions in a box.
The microscopic state of the fermion gas at $T = 0$ is thus quite different from the case
of a boson gas, and this brings about distinctive nonclassical thermodynamic behaviour
that we now investigate.
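
The sharpening of the Fermi-Dirac function (12.3) into a step as the temperature falls can be seen in a few lines of code. This is an illustrative sketch: the function name, the overflow-safe rearrangement and the sample values (arbitrary energy units with $\mu = 1$) are assumptions.

```python
import math

def fermi_dirac(eps, mu, kT):
    """Occupation number (12.3), arranged so large arguments never overflow."""
    x = (eps - mu) / kT
    if x > 0:
        w = math.exp(-x)
        return w / (1.0 + w)
    return 1.0 / (1.0 + math.exp(x))

mu = 1.0                                   # illustrative units (assumed)
for kT in (0.5, 0.1, 0.001):
    row = [round(fermi_dirac(eps, mu, kT), 4) for eps in (0.5, 0.9, 1.0, 1.1, 1.5)]
    print(kT, row)
```

Each row passes through one half at $\epsilon = \mu$, and the transition from near unity to near zero narrows as $kT$ is reduced.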


12.3 Thermal Properties of a Fermion Gas 


The properties of the fermion gas are deduced by placing all the single particle states,
described by the density of states in energy $g(\epsilon)$ as given in (9.11), in contact with the
same heat and particle reservoir. As in (11.9) the grand partition function is written as
an integral:

$$\ln Z_G(\mu, T) \approx \int_0^\infty g(\epsilon)\ln Z_G^\epsilon\,d\epsilon = \int_0^\infty g(\epsilon)\ln\left[1+\exp\left(\frac{\mu-\epsilon}{kT}\right)\right]d\epsilon, \tag{12.4}$$

using $Z_G^\epsilon$ from (12.2), and the statistical properties of the gas then follow by differentiation.
The mean total number of fermions in the system is

$$\langle N\rangle = kT\left(\frac{\partial\ln Z_G}{\partial\mu}\right)_{T,V} = \int_0^\infty g(\epsilon)\langle N\rangle_\epsilon\,d\epsilon, \tag{12.5}$$

where $\langle N\rangle_\epsilon$ is given by (12.3), corresponding to the intuitive requirement that the mean
number of particles in the gas is a sum of mean populations in all the single particle
states. The particle density for spin $s$ and mass $m$ is then written as

$$n(\mu, T) = \frac{\langle N\rangle}{V} = \frac{(2s+1)}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty\frac{\epsilon^{1/2}\,d\epsilon}{\exp\left(\frac{\epsilon-\mu}{kT}\right)+1}, \tag{12.6}$$


which differs from (11.15) only by the replacement of a minus sign by a plus sign in 
the denominator of the integrand. A nice way to remember this is that the minus sign is 
cut in half to make a plus sign, when the particles in question have a half integer spin. 

The thermodynamic properties of the fermion gas can be derived in just the same way 
that we employed for bosons in Section 11.3. In particular, we find that the mean energy 
may be written as 

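$$\langle E\rangle = \int_0^\infty g(\epsilon)\,\epsilon\,\langle N\rangle_\epsilon\,d\epsilon, \tag{12.7}$$

such that the energy density is

$$e(\mu, T) = \frac{\langle E\rangle}{V} = \frac{(2s+1)}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty\frac{\epsilon^{3/2}\,d\epsilon}{\exp\left(\frac{\epsilon-\mu}{kT}\right)+1}, \tag{12.8}$$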
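which again differs only slightly with respect to (11.17). The entropy of the gas is

$$S_G = \frac{1}{T}\int_0^\infty g(\epsilon)\left[\frac{\epsilon-\mu}{\exp\left(\frac{\epsilon-\mu}{kT}\right)+1} + kT\ln\left(1+\exp\left(\frac{\mu-\epsilon}{kT}\right)\right)\right]d\epsilon, \tag{12.9}$$

in contrast to (11.20), and finally the pressure is given by

$$p = kT\,\frac{(2s+1)}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^\infty\epsilon^{1/2}\ln\left[1+\exp\left(\frac{\mu-\epsilon}{kT}\right)\right]d\epsilon, \tag{12.10}$$

which is to be compared with (11.24). We can develop (12.10) to establish a link between
the pressure and the energy density. We write

$$p = kT\,\frac{(2s+1)}{(2\pi)^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\left\{\left[\frac{2}{3}\epsilon^{3/2}\ln\left(1+\exp\left(\frac{\mu-\epsilon}{kT}\right)\right)\right]_0^\infty + \frac{2}{3kT}\int_0^\infty\frac{\epsilon^{3/2}\,d\epsilon}{\exp\left(\frac{\epsilon-\mu}{kT}\right)+1}\right\} = \frac{2}{3}e, \tag{12.11}$$

showing that relation (2.5) applies to the gas in both classical and quantum regimes.

Figure 12.2 Properties of a gas of spin half fermions as a function of chemical potential at a
given temperature. Quantum deviations from classical properties $e_{cl} = (3/2)nkT$ and $p_{cl} = nkT$
appear when $n > n_q(T)$.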
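
The link between pressure and energy density can be tested numerically without any integration by parts. The sketch below assumes that the pressure follows from $p = kT\ln Z_G/V$ with $\ln Z_G$ given by (12.4), and that the energy density is (12.8); both are evaluated with the common prefactor omitted and with $\alpha = \mu/kT$. The helper names, the Simpson rule and the cut-off `t_max` are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def fermi_dirac(x):
    """1 / (exp(x) + 1), written to avoid overflow for large |x|."""
    if x > 0:
        w = math.exp(-x)
        return w / (1.0 + w)
    return 1.0 / (1.0 + math.exp(x))

def log_one_plus_exp(y):
    """ln(1 + exp(y)) evaluated stably for either sign of y."""
    return y + math.log1p(math.exp(-y)) if y > 0 else math.log1p(math.exp(y))

def pressure_and_energy(alpha, t_max=10.0):
    """Dimensionless pressure and energy density integrals, using x = eps/kT = t**2."""
    pressure = simpson(lambda t: 2.0 * t ** 2 * log_one_plus_exp(alpha - t * t), 0.0, t_max)
    energy = simpson(lambda t: 2.0 * t ** 4 * fermi_dirac(t * t - alpha), 0.0, t_max)
    return pressure, energy

for alpha in (-2.0, 0.0, 5.0):
    p, e = pressure_and_energy(alpha)
    print(alpha, p / e)
```

The ratio comes out as 2/3 whether the gas is dilute (negative $\alpha$) or degenerate (positive $\alpha$).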

The classical limits of these expressions are the same as those of the boson gas
obtained in Section 11.3. However, in the nonclassical regime, the deviations from classical
behaviour are rather different. We reiterate that unlike the boson gas, there is no
apparent upper limit to the chemical potential, and therefore no reason for an abrupt
change in behaviour analogous to Bose-Einstein condensation. As a function of particle
density at a constant temperature, the properties of the gas are illustrated in Figure 12.2
for spin half particles, which should be compared with those of the boson gas in Figure
11.2. The key conclusions are that the mean energy and pressure exceed classical expectations
when $n > n_q(T)$. In Figure 12.3 we see that the entropy per particle tends to
zero as the temperature is reduced at constant density. In contrast, the Sackur-Tetrode
expression $s_{cl} = nk\ln((2s+1)\mathrm{e}^{5/2}n_q/n) = nk\ln((2s+1)\mathrm{e}^{5/2}(T/T_q)^{3/2})$ becomes negative
for sufficiently low temperature, which is obviously unphysical.

Figure 12.3 Entropy per particle $s/n$ of a gas of spin half fermions against temperature, compared
with the classical Sackur-Tetrode expression.

By combining $e(\mu, V, T)$ and $n(\mu, V, T)$ to give the energy density in terms of particle
density, $e(n, V, T)$, we can contrast the low temperature gas behaviour at constant density
with the classical result $e_{cl} = (3/2)nkT$ in Figure 12.4. The gradient $(\partial e/\partial T)_{n,V}$ provides
the heat capacity per unit volume, and clearly this is suppressed with respect to classical
gas behaviour when $T < T_q$. This result provides crucial experimental support for the
approach, as it explains the missing heat capacity of electrons in metals.
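The two limits of the energy per particle can be checked numerically. The sketch below is only an illustration: it works in reduced units $x = \epsilon/kT$ and $\eta = \mu/kT$, so that $e/(nkT)$ is a ratio of two Fermi-Dirac integrals taken from (12.8) and (12.6). The function names, the cutoff of the integration range and the use of Simpson's rule are all assumptions made for this example.

```python
import math


def fermi_dirac_integral(power, eta, steps=2000):
    """Approximate the integral of x**power / (exp(x - eta) + 1) for x from 0 to infinity.

    The substitution x = t**2 removes the square-root behaviour at the origin.
    The upper limit is cut off 40 units beyond the Fermi edge, an assumption
    of this sketch that is adequate because the integrand decays exponentially.
    """
    t_max = math.sqrt(max(eta, 0.0) + 40.0)
    h = t_max / steps  # steps must be even for Simpson's rule

    def integrand(t):
        x = t * t
        return 2.0 * t * x**power / (math.exp(x - eta) + 1.0)

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def energy_per_particle_in_kT(eta):
    """Return e / (n k T) for reduced chemical potential eta = mu / kT."""
    return fermi_dirac_integral(1.5, eta) / fermi_dirac_integral(0.5, eta)


# Very negative eta is the classical regime; large positive eta is degenerate.
for eta in (-10.0, 0.0, 10.0, 50.0):
    print(eta, energy_per_particle_in_kT(eta))
```

For very negative $\eta$ the ratio approaches the classical value 3/2, while for large positive $\eta$ it approaches $(3/5)\eta$, so the energy per particle becomes almost independent of temperature and the heat capacity is suppressed.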


12.4 Maxwell-Boltzmann Statistics 


It will have been noticed that the wavefunction symmetry rules that apply to gases 
of bosons and fermions, and which are responsible for their distinctive nonclassical 
physical behaviour, give rise to expressions for the mean particle population in the 
single particle states that differ only slightly. These are the key results of Bose-Einstein 
and Fermi-Dirac statistics, respectively. The mean population is 


$$\langle N \rangle_\epsilon = \frac{1}{\exp\left(\frac{\epsilon-\mu}{kT}\right) \mp 1}, \qquad (12.12)$$


where the minus sign applies to bosons and the plus sign to fermions. We have already
explored the classical limit of the properties of these gases, starting with (11.16) for
bosons in Section 11.3, and continuing with fermions in Section 12.3. Both forms of
(12.12) coalesce into a classical version that goes by the name of Maxwell-Boltzmann
statistics, and we shall use this to derive a connection between the exact formulation of
the grand partition function of ideal quantum gases, presented here and in Chapter 11,
and the approximate canonical partition function constructed for classical conditions in
Section 9.5.

Figure 12.4 Energy per particle e/n of a gas of spin half fermions against temperature, compared
with the classical behaviour. As $T \to 0$, the gradient of the curve, and hence the heat capacity at
constant volume, goes to zero.

Recall that the classical limit corresponds to conditions where the probability of occupancy
of single particle states is very small, namely $\langle N \rangle_\epsilon \ll 1$. This is equivalent to
$(\epsilon - \mu)/kT \gg 1$, implying that the chemical potential of the reservoir should lie below
the lowest energy single particle state by a considerable multiple of kT. There are indications
in Figures 11.1 and 12.1 that when $\mu$ is very negative, the occupation number of
all states is small. Figure 12.5 is a sketch of the chemical potential of boson and fermion
gases as a function of temperature, at constant particle density, that illustrates the same
point. It is a generalisation of Figure 2.9, but with low temperature limits included on
the basis of (11.15) and (12.6). Clearly a very negative chemical potential corresponds
to high temperatures.

Figure 12.5 Chemical potential of gases of spin zero bosons and spin half fermions as a function
of temperature.

In these circumstances, both forms of the occupation number go over to 

$$\langle N \rangle_\epsilon \approx \exp\left(-\frac{\epsilon-\mu}{kT}\right), \qquad (12.13)$$


which is known as Maxwell-Boltzmann statistics. The distinction between boson and 
fermion is lost, because if there is a low probability of occupancy of single particle states, 
the implications of wavefunction symmetry with regard to multiple occupancy are not 
important. Maxwell-Boltzmann statistics as a function of single particle state energy e 
are illustrated in Figure 12.6, alongside Bose-Einstein and Fermi-Dirac statistics for 
the same chemical potential and temperature. 
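The convergence of the three statistics at low occupancy is easy to see numerically. The following sketch simply evaluates (12.12) and (12.13); the function names are chosen for this illustration, and energies are measured in units where kT = 1 with the chemical potential set to zero, as in Figure 12.6.

```python
import math


def bose_einstein(eps, mu, kT):
    """Mean occupation of a single particle state for bosons (requires eps > mu)."""
    return 1.0 / (math.exp((eps - mu) / kT) - 1.0)


def fermi_dirac(eps, mu, kT):
    """Mean occupation of a single particle state for fermions."""
    return 1.0 / (math.exp((eps - mu) / kT) + 1.0)


def maxwell_boltzmann(eps, mu, kT):
    """Classical limit shared by both statistics when the occupancy is small."""
    return math.exp(-(eps - mu) / kT)


# Compare the three occupation numbers as (eps - mu)/kT increases.
for x in (0.5, 2.0, 5.0, 10.0):
    print(x, bose_einstein(x, 0.0, 1.0), maxwell_boltzmann(x, 0.0, 1.0), fermi_dirac(x, 0.0, 1.0))
```

The Maxwell-Boltzmann value always lies between the Fermi-Dirac and Bose-Einstein values, and the three agree to within a relative difference of order $\exp(-(\epsilon-\mu)/kT)$ once the occupancy is small.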

The convergence of the expressions for the grand partition function of boson and
fermion gases in the classical limit can also be explored. We have

(12.15)

(12.16)

Figure 12.6 Maxwell-Boltzmann statistics, indicating the mean occupation number of states of
energy $\epsilon$ in contact with a reservoir at chemical potential $\mu$ (chosen in this example to be zero)
and temperature T, valid for particles of any spin when the occupancy is small. The more general
Fermi-Dirac and Bose-Einstein statistics are shown for comparison.

We note that the right hand side is proportional to the one-particle canonical partition
function for the gas derived in Section 9.3. Thus $\ln Z_G = \exp(\mu/kT) Z_1$, or equivalently

$$Z_G = \exp\left(Z_1 e^{\mu/kT}\right) = \sum_{N=0}^{\infty} \frac{Z_1^N}{N!} \exp(N\mu/kT). \qquad (12.17)$$

But recalling from (7.9) that $Z_G = \sum_{N=0}^{\infty} Z_N \exp(N\mu/kT)$, we can read off the canonical
partition function for the classical gas of N particles, namely

$$Z_N = \frac{Z_1^N}{N!}, \qquad (12.18)$$

just as we constructed in (9.19).

This result underlines the point that the development of gas properties using the grand
partition function automatically takes indistinguishability into account: it has reproduced
the $1/N!$ correction factor for a classical gas. This is due to its foundation upon populations
of single particle states and not on the circumstances of individually labelled
particles.
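As a quick numerical illustration of this point, one can check that summing canonical partition functions of the form $Z_N = Z_1^N/N!$, weighted by $\exp(N\mu/kT)$, reproduces the exponential form of the classical grand partition function. The function names and the sample values of $Z_1$ and the fugacity $\exp(\mu/kT)$ below are arbitrary choices made for this sketch.

```python
import math


def grand_partition_from_canonical(z1, fugacity, n_max=60):
    """Sum of Z_N * fugacity**N over N, assuming Z_N = z1**N / N!."""
    return sum((z1 * fugacity) ** n / math.factorial(n) for n in range(n_max + 1))


def grand_partition_closed_form(z1, fugacity):
    """Classical grand partition function exp(fugacity * Z_1)."""
    return math.exp(fugacity * z1)


z1 = 5.0         # illustrative one-particle partition function
fugacity = 0.3   # illustrative value of exp(mu / kT)
print(grand_partition_from_canonical(z1, fugacity))
print(grand_partition_closed_form(z1, fugacity))
```

Without the factor $1/N!$ the sum would be a geometric series rather than an exponential, so the agreement of the two numbers depends directly on the indistinguishability correction.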

12.5 The Degenerate Fermion Gas 

As noted in Section 10.3, we have ready access to fermion gases that exceed their
quantum concentration, and so it is to the properties of such gases in the condition
$n \gg n_q$ that we turn next.

In the extremes of high density or low temperature, the single particle states of the 
fermion gas system become occupied with probability unity up to a maximum single 
particle state energy, and states above this level are empty. In this condition, the fermion 
gas is described as degenerate. The meaning of the word in this context is that the 
particles have sunk into the lowest available energy states. Note that this is distinct from 
the use of ‘degeneracy’ to indicate the number of quantum states of a system at the 
same energy. Degenerate in the sense of ‘debased’ is the intended meaning. The entropy 
of the gas is zero, as the microstate assumed by the system is known with certainty. 
However, the gas still possesses energy and exerts a pressure, and in this section we 
shall determine these properties. 

For the degenerate gas, we replace the Fermi-Dirac function in (12.5) with a step
function, the $T \to 0$ limit alluded to in Section 12.2, such that

$$\langle N \rangle = \int_0^{\epsilon_F} g(\epsilon)\, d\epsilon, \qquad (12.19)$$

where the highest occupied single particle state has an energy $\epsilon_F$, known as the Fermi
energy. It is equal to the limit of the chemical potential $\mu$ of the fermion gas at zero
temperature, shown in Figure 12.5, as this defines the energy of the state that is half
filled. Inserting the usual density of states, we get

$$\langle N \rangle = \frac{(2s+1)V}{(2\pi)^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^{\epsilon_F} \epsilon^{1/2}\, d\epsilon, \qquad (12.20)$$

and if we specify a spin of one half, we obtain a mean particle density

$$n = \frac{\langle N \rangle}{V} = \frac{2}{(2\pi)^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \frac{2\epsilon_F^{3/2}}{3}, \qquad (12.21)$$

or

$$\epsilon_F = \frac{\hbar^2}{2m} (3\pi^2 n)^{2/3}. \qquad (12.22)$$


The Fermi energy is related to the temperature below which quantum effects become
substantial. We define the Fermi temperature $T_F = \epsilon_F/k$ and then write

$$\frac{T_F}{T} = \frac{\hbar^2}{2mkT} (3\pi^2 n)^{2/3} = \left(\frac{3\pi^{1/2}}{8}\right)^{2/3} \left(\frac{n}{n_q}\right)^{2/3}, \qquad n_q = \left(\frac{2\pi mkT}{h^2}\right)^{3/2}. \qquad (12.23)$$

Hence the condition for degeneracy $n \gg n_q$ is equivalent to $T \ll T_F$, and indeed the
expression for the Fermi energy has a form very similar to that of the Bose-Einstein
condensation temperature $T_c$ defined in (11.29) and the temperature $T_q$ given in (11.22);
in fact $T_F = (3\pi^{1/2}/8)^{2/3}\, T_q \approx 0.76\, T_q$.
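The numerical factor relating the two temperatures can be verified directly. This sketch assumes that $T_q$ is defined by setting the particle density equal to the quantum concentration, $n = (2\pi mkT_q/h^2)^{3/2}$; that definition, the function names and the sample density are assumptions of the example rather than quotations from (11.22).

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
H = 2.0 * math.pi * HBAR  # Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
M_E = 9.1093837015e-31    # electron mass, kg


def fermi_temperature(n, m=M_E):
    """T_F = eps_F / k, with eps_F from (12.22) for spin half particles."""
    return HBAR**2 * (3.0 * math.pi**2 * n) ** (2.0 / 3.0) / (2.0 * m * K_B)


def quantum_temperature(n, m=M_E):
    """Temperature at which the quantum concentration equals n (assumed definition)."""
    return H**2 * n ** (2.0 / 3.0) / (2.0 * math.pi * m * K_B)


n = 1.0e28  # illustrative particle density, m^-3
ratio = fermi_temperature(n) / quantum_temperature(n)
print(ratio, (3.0 * math.sqrt(math.pi) / 8.0) ** (2.0 / 3.0))
```

The ratio does not depend on the density or the particle mass, since both temperatures scale as $n^{2/3}/m$.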

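Similarly, we can determine the total energy of the degenerate fermion gas as

$$\langle E \rangle = \frac{(2s+1)V}{(2\pi)^2} \left(\frac{2m}{\hbar^2}\right)^{3/2} \int_0^{\epsilon_F} \epsilon^{3/2}\, d\epsilon, \qquad (12.24)$$

and together with (12.21) this implies

$$\frac{\langle E \rangle}{\langle N \rangle} = \frac{2\epsilon_F^{5/2}/5}{2\epsilon_F^{3/2}/3} = \frac{3\epsilon_F}{5}, \qquad (12.25)$$

such that the mean energy per fermion is $3\epsilon_F/5$. The pressure of the degenerate gas is
$p = 2\langle E \rangle/(3V)$ according to (12.11), so

$$p = \frac{2}{3} \frac{\langle E \rangle}{\langle N \rangle}\, n = \frac{2}{5}\, n \epsilon_F, \qquad (12.26)$$

and a sketch of its temperature dependence is given in Figure 12.7. The pressure does
not go to zero as $T \to 0$. Let us now apply these results to electrons in metals, the prime
example of a degenerate fermion gas.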


12.6 Electron Gas in Metals 

In a metal, the electrons in the outermost shell of the constituent atoms are only loosely
bound, and are able to wander around the material leaving behind a positively charged
ion. These electrons can be roughly considered to form a gas of noninteracting particles
of spin 1/2: the free electron or Drude model mentioned in Section 10.3. We ignore the
fact that charged electrons ought to interact strongly with one another and with the ions;
these forces can be neglected to a first approximation because of electrostatic screening
effects. The ability of the electrons to move around explains why metals are able to
conduct electricity and heat so well.

Figure 12.7 Pressure of a gas of spin half fermions at constant density as a function of temperature,
compared with the classical ideal gas law. As $T \to 0$ the pressure tends towards the
degeneracy pressure $p = (2/5)n\epsilon_F$.

The concentration of conduction electrons n is about $10^{28}$ m$^{-3}$, and so from (12.22)
the Fermi temperature $T_F = \epsilon_F/k$ is of order $10^4$ K. Thus at room temperature, $T \ll T_F$
and the electron gas is highly degenerate. Almost all electrons are in states from which
no thermally driven transitions into empty states can easily be made: they are locked
into their lattice points in k-space. Nevertheless, the electrons are free to drift in a
collective manner under the influence of an electric field, and hence are able to conduct
electricity.

The mean energy per electron is $(3/5)\epsilon_F$, equal to about 1 eV for the conditions given.
This should be compared with the expected classical energy $(3/2)kT$ of around 40 meV
at room temperature, and so the electrons are typically moving much more rapidly than
particles of a classical gas at the same temperature. The pressure of the gas is $(2/5)n\epsilon_F$,
which is approximately equal to the classical pressure of the gas at the Fermi temperature,
namely $10^9$ Pa or $10^4$ atmospheres. This enormous electron pressure makes an important
repulsive contribution to the cohesion of metals, balancing the electrostatic attraction
between ions. While the energy and pressure of the gas are considerably greater than
values expected of a classical gas at the same density, the entropy can be taken to be
approximately zero. All these features are apparent from Figures 12.3, 12.4 and 12.7.
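These order of magnitude estimates can be reproduced with a few lines of code. The sketch below evaluates (12.22), (12.25) and (12.26) for an electron density of $10^{28}$ m$^{-3}$; the variable names and the choice of 300 K for room temperature are assumptions of this illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electronvolt in joules


def fermi_energy(n, m=M_E):
    """Fermi energy of a spin half gas of density n, from (12.22)."""
    return HBAR**2 * (3.0 * math.pi**2 * n) ** (2.0 / 3.0) / (2.0 * m)


n = 1.0e28                                   # conduction electron density, m^-3
eps_f = fermi_energy(n)
t_f = eps_f / K_B                            # Fermi temperature, K
mean_energy_ev = 0.6 * eps_f / EV            # (3/5) eps_F in eV
pressure = 0.4 * n * eps_f                   # degeneracy pressure (2/5) n eps_F, Pa
classical_energy_ev = 1.5 * K_B * 300.0 / EV  # (3/2) k T at 300 K, in eV

print(t_f, mean_energy_ev, pressure, classical_energy_ev)
```

With these inputs the Fermi temperature comes out near $2 \times 10^4$ K, the mean energy per electron near 1 eV and the pressure near $10^9$ Pa, while the classical thermal energy at room temperature is only a few tens of meV.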

The typical electronic energy is nevertheless well below its rest mass energy of about
511 keV. This confirms that there is no inconsistency in using a nonrelativistic expression
for the energy of an electron $\epsilon = \hbar^2 k^2/(2m_e)$, where $m_e$ is the mass of the electron, in
the derivation of the density of states $g(\epsilon)$ on which the model is built. However, if
we imagined increasing the density of an electron gas, such an inconsistency would
ultimately emerge since the Fermi energy and the mean energy per particle increase
in proportion to $n^{2/3}$. In the next section we consider a situation where a relativistic
treatment of the electrons would be more appropriate, and we shall see that such an
extreme situation plays a role in the loss of stability of stars.


12.7 White Dwarfs and the Chandrasekhar Limit


The properties of the fermion gas allow us to understand the mechanical stability, or lack 
of it, of elderly stars. Leaving aside much of the complex physics of such systems, the 
issue can be understood as a balance between gravitational collapse and the resistance 
to compression of the gaseous core of the star. 

Let us approach the question of stability in the following simplified way. The pressure
at a depth z below the surface of the sea is given by $\rho_w g z$, where $\rho_w$ is the mass density
of water and g is the gravitational acceleration, corresponding to an increase of about
one atmosphere for every ten metres of descent, and where we have neglected atmospheric
pressure. Let us estimate the pressure at the centre of a star of mass $M_s$ in a similar way.
Replacing g by the local acceleration towards the stellar centre according to Newtonian
gravitation, namely $GM(r)/r^2$, where G is the gravitational constant, r is the distance
to the centre of the star and $M(r)$ is the mass of the star contained within the radius r,
we readily find that the difference in pressure between the centre of the star at r = 0
and its surface at r = R is

$$p(0) - p(R) = \Delta p = \int_0^R \frac{GM(r)}{r^2}\, \rho_s(r)\, dr, \qquad (12.27)$$

where $\rho_s$ is the mass density of stellar material. We wish to obtain a rough estimate of $\Delta p$
and therefore assume the density to be a constant; then we can write $M(r) = M_s (r/R)^3$.
Thus $\Delta p \approx G\rho_s M_s R^{-3} \int_0^R r\, dr = G\rho_s M_s/(2R)$. Inserting $\rho_s = 3M_s/(4\pi R^3)$, we find the
pressure at the centre of a star of mass $M_s$ and radius R to be

$$p_G \approx \frac{3GM_s^2}{8\pi R^4}, \qquad (12.28)$$

since $p(R) \approx 0$. Applying this to the sun with $R \approx 7 \times 10^8$ m and $M_s \approx 2 \times 10^{30}$ kg, we
estimate the core pressure due to gravitational attraction to be $10^{14}$ Pa, using $G \approx 6.7 \times 10^{-11}$
m$^3$ kg$^{-1}$ s$^{-2}$. Better models of the sun might be more accurate but we are interested
only in the rough dependence of the core pressure on the mass and radius of the star.
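The estimate (12.28) is simple enough to evaluate directly. The following sketch uses the rounded solar values quoted above; the function name is an assumption of this illustration.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def central_pressure(mass, radius):
    """Rough central pressure 3 G M^2 / (8 pi R^4) of a uniform-density star, from (12.28)."""
    return 3.0 * G * mass**2 / (8.0 * math.pi * radius**4)


p_sun = central_pressure(2.0e30, 7.0e8)
print(p_sun)  # of order 1e14 Pa

# Halving the radius at fixed mass raises the pressure by a factor of 2**4 = 16.
print(central_pressure(2.0e30, 3.5e8) / p_sun)
```

The strong $R^{-4}$ dependence is the feature that matters in what follows, since it sets how quickly the gravitational burden grows as a star contracts.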

We assume that the gravitational pressure is balanced by the pressure of the hot gases
of electrons and nuclei that are to be found in the centre of a star. Presuming that most
of the mass of the star is carried by hydrogen nuclei, and continuing to assume that the
stellar density is constant throughout the star, we estimate that the particle density in the
star is $n_s \approx 3(M_s/m_p)/(4\pi R^3)$, where $m_p$ is the proton mass. For the sun, this is about
$10^{30}$ m$^{-3}$, and there will be roughly equal numbers of electrons and protons. Assuming
each species has the classical thermodynamic properties of an ideal gas, the pressure at
the core would be $p_s = n_s kT_s \approx 10^7\, T_s$ Pa, with $T_s$ in kelvin. In order for this to balance
the gravitational pressure, the core temperature $T_s$ needs to be of order $10^7$ K, and this is maintained
by the process of nuclear fusion. The mean energy per particle is $(3/2)kT_s \approx 1$ keV,
which is nonrelativistic, as it is much less than the rest mass energy of either the electron (511 keV) or
the proton (938 MeV). Furthermore, the quantum concentration for the lightest
available particle species, the electron, is

$$n_q = \left(\frac{2\pi m_e kT_s}{h^2}\right)^{3/2} \approx 10^{32}\ \mathrm{m}^{-3}, \qquad (12.29)$$

and this is safely higher than the estimated particle density $n_s$. The mechanical balance
at the centre of the star is quite adequately explained using the properties of classical
ideal gases.
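The chain of estimates in this argument can be followed numerically. The sketch below uses the same uniform-density model and rounded solar values; the function names are illustrative assumptions, and the core temperature is obtained simply by equating $n_s kT_s$ to the gravitational pressure (12.28).

```python
import math

G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J s
M_E = 9.1093837015e-31  # electron mass, kg
M_P = 1.67262192e-27    # proton mass, kg
M_SUN = 2.0e30          # rounded solar mass, kg
R_SUN = 7.0e8           # rounded solar radius, m


def particle_density(mass, radius):
    """Number of protons per unit volume for a uniform star made of hydrogen."""
    return 3.0 * (mass / M_P) / (4.0 * math.pi * radius**3)


def core_temperature(mass, radius):
    """Temperature at which the ideal gas pressure n k T balances (12.28)."""
    p_gravity = 3.0 * G * mass**2 / (8.0 * math.pi * radius**4)
    return p_gravity / (particle_density(mass, radius) * K_B)


def quantum_concentration(temperature, m=M_E):
    """Quantum concentration (2 pi m k T / h^2)^(3/2), as in (12.29)."""
    return (2.0 * math.pi * m * K_B * temperature / H**2) ** 1.5


n_s = particle_density(M_SUN, R_SUN)
t_s = core_temperature(M_SUN, R_SUN)
print(n_s, t_s, quantum_concentration(t_s))
```

The particle density comes out close to $10^{30}$ m$^{-3}$ and the temperature close to $10^7$ K, with the electron quantum concentration roughly two orders of magnitude above the particle density, which is the condition for classical behaviour.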

But when a star begins to run out of fusible nuclei, the supply of heat to maintain
the core temperature and pressure fails, and the star begins to change. The sequence of
events is rather complex, but we shall address a particular issue: what pressure might
ultimately resist gravitational collapse if the core begins to cool? The quantum properties
of the electron gas play a crucial role. As the core contracts, the particle density increases
in proportion to $R^{-3}$. Without estimating how the electronic quantum concentration $n_q$
might change as the temperature falls, it is clear that a reduction in stellar radius will
raise the density of electrons in the star until it enters the regime where degenerate
gas properties take over from classical behaviour. Under these circumstances, the gas
pressure is given by $p_s = (2/5)n_s\epsilon_F \propto n_s^{5/3}$ using (12.22). Thus the resistance to compression
increases, until it can balance gravitational forces, in which case an equilibrium
is established at a radius $R_d$ given by

$$p_s = \frac{2}{5} \frac{\hbar^2}{2m_e} (3\pi^2)^{2/3} n_s^{5/3} = \frac{\hbar^2}{5m_e} (3\pi^2)^{2/3} \left(\frac{3M_s}{4\pi m_p R_d^3}\right)^{5/3} = \frac{3GM_s^2}{8\pi R_d^4}, \qquad (12.30)$$

which leads to

$$R_d = \frac{2}{5} \left(\frac{9\pi}{4}\right)^{2/3} \frac{\hbar^2}{G M_s^{1/3} m_p^{5/3} m_e}. \qquad (12.31)$$

If we insert values of the parameters, then the equilibrium radius for a stellar mass
equal to that of the sun is $7 \times 10^6$ m. This is about 1% of the present radius of the sun!
Consequently, the particle density $n_s$ in this star is about six orders of magnitude higher
than that of the sun, or about $10^{36}$ m$^{-3}$.
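The balance between degeneracy pressure and gravitational pressure can be checked with a short calculation. This is a sketch of the uniform-density model only, with one electron per proton mass of stellar material; the function names are assumptions of the illustration.

```python
import math

G = 6.6743e-11           # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
M_P = 1.67262192e-27     # proton mass, kg


def gravitational_pressure(mass, radius):
    """Central pressure required by gravity, from (12.28)."""
    return 3.0 * G * mass**2 / (8.0 * math.pi * radius**4)


def degeneracy_pressure(mass, radius):
    """Nonrelativistic electron degeneracy pressure (2/5) n eps_F, from (12.26)."""
    n = 3.0 * mass / (4.0 * math.pi * M_P * radius**3)
    eps_f = HBAR**2 * (3.0 * math.pi**2 * n) ** (2.0 / 3.0) / (2.0 * M_E)
    return 0.4 * n * eps_f


def white_dwarf_radius(mass):
    """Equilibrium radius at which the two pressures are equal, as in (12.31)."""
    prefactor = 0.4 * (9.0 * math.pi / 4.0) ** (2.0 / 3.0)
    return prefactor * HBAR**2 / (G * mass ** (1.0 / 3.0) * M_P ** (5.0 / 3.0) * M_E)


mass = 2.0e30  # roughly one solar mass, kg
r_d = white_dwarf_radius(mass)
print(r_d)
print(degeneracy_pressure(mass, r_d) / gravitational_pressure(mass, r_d))
```

With these constants the radius comes out at several thousand kilometres, the same order as the value quoted above, and the ratio of the two pressures at that radius is unity. Evaluating the same ratio at a slightly smaller radius gives a value above one, which is the restoring behaviour that makes the equilibrium stable.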

Considering its quantum concentration estimated in (12.29), the electron gas in such
a star is degenerate, as required. The gases of nuclei, on the other hand, have a quantum
concentration that is approximately $(m_p/m_e)^{3/2} \approx 10^5$ times higher, or about $10^{37}$ m$^{-3}$, and
they remain approximately classical in behaviour. The burden of resisting gravity is
carried by the electron gas alone. From a consideration of (12.28) the pressure at the
centre of the star is eight orders of magnitude higher than at the centre of the sun,
reaching the extraordinary value of $10^{22}$ Pa or $10^{17}$ atmospheres. The density of the
star $\rho_s$ is $3M_s/(4\pi R_d^3)$ or about $10^9$ kg m$^{-3}$, about a million times denser than ordinary
terrestrial matter.

Such a star is stable since the resistance to compression $p_s \propto R^{-5}$ increases more
rapidly than the gravitational pressure $p_G \propto R^{-4}$ if the radius should fall slightly below
$R_d$. The resistance derives from the Pauli Exclusion Principle and the requirement that
electrons should avoid one another for reasons of wavefunction antisymmetry, a matter
alluded to in question 10.2. It is a mighty consequence of quantum statistical thermodynamics.

Figure 12.8 A star of mass $M_s$ and radius R that has exhausted its supply of fuel is stabilised by
a balance between compressive pressure $p_G$ brought about by gravity, and resistive pressure $p_s$ of
components of the stellar material. The electron gas becomes degenerate before the nucleon gas
(formed from protons and ultimately neutrons) since the quantum concentration increases with
particle mass. The electron pressure (black line) then rises rapidly on compression until it
becomes equal to $p_G$ (blue line) at the radius $R_d$, stabilising a white dwarf. A star that exceeds
the Chandrasekhar mass $M_{ch}$, however, exerts too great a gravitational pressure to be balanced by
this mechanism, since the electron gas softens at great compressions because of relativistic effects.
For such cases, a balance between gravitational pressure (red line) and the pressure of the neutron
gas (dashed line) in a nonrelativistic degenerate condition is possible at a radius $R_n$, producing an
extremely highly compressed neutron star.


The star remains hot enough to radiate for some considerable time, indeed of the order 
of the age of the universe, but with little fusion going on at its centre, its final fate is 
to become a cold, stable stellar remnant. The balance between pressures is illustrated in 
Figure 12.8. 

This sequence of events explains very well the existence of a peculiar class of stars, 
the white dwarfs. They are extraordinarily small in size, and emit dimly but with a white 
colouration. These features match very well the properties of the stellar remnant we 
have just described. It is estimated that almost all stars will eventually become white 
dwarfs after they exhaust their fuel supply. As they cool down and emit less strongly, 
they would evolve into remnants called black dwarfs, for obvious reasons. 

But there is a flaw in this story, and it involves the assumption that the electrons in
the degenerate core of the star are characterised by nonrelativistic energies. This gave
us the density of single particle states $g(\epsilon) \propto \epsilon^{1/2}$ in (9.11) and then the degeneracy
pressure (12.26) that we employed in the balance relation (12.30). So it would be an
important check of self-consistency to make sure that the mean energy per electron in the
star did not exceed the rest mass energy $m_e c^2$, which is the criterion for nonrelativistic
behaviour. We write

$$\frac{3}{5}\epsilon_F = \frac{3}{5} \frac{\hbar^2}{2m_e} (3\pi^2 n_s)^{2/3} = \frac{3}{5} \frac{\hbar^2}{2m_e} \left(\frac{9\pi M_s}{4m_p}\right)^{2/3} \frac{1}{R_d^2} = \frac{15}{8} \left(\frac{4}{9\pi}\right)^{2/3} \frac{G^2 M_s^{4/3} m_p^{8/3} m_e}{\hbar^2} < m_e c^2, \qquad (12.32)$$

and neglecting all numerical factors, this reduces to

$$M_s < \frac{1}{m_p^2} \left(\frac{c\hbar}{G}\right)^{3/2}. \qquad (12.33)$$

The numerical value of the right hand side of this inequality is of the order of $10^{30}$ kg,
in other words, around the mass of the sun! A more careful analysis suggests that the
upper limit on the mass of a star that could support itself as a white dwarf according to
the model described above is about $2.9 \times 10^{30}$ kg, or about 1.44 times the solar mass.
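The mass scale on the right hand side of (12.33) involves only fundamental constants and the proton mass, so it is straightforward to evaluate. The names below are chosen for this sketch, and the solar mass value is the usual rounded figure.

```python
C = 2.99792458e8         # speed of light, m/s
G = 6.6743e-11           # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_P = 1.67262192e-27     # proton mass, kg
M_SUN = 1.989e30         # solar mass, kg


def degenerate_mass_scale():
    """Mass scale (c hbar / G)^(3/2) / m_p^2 appearing in (12.33)."""
    return (C * HBAR / G) ** 1.5 / M_P**2


mass_scale = degenerate_mass_scale()
print(mass_scale, mass_scale / M_SUN)
```

The result is a few times $10^{30}$ kg, within a factor of order unity of the solar mass, as expected of an estimate in which all numerical factors have been dropped.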

This limitation of the model is rather serious. If we were to repeat the analysis of the
fermion gas but this time assume that the energies of the single particle states were highly
relativistic, such that the relationship between single particle energy and wavevector were
$\epsilon = \hbar ck$ instead of $\epsilon = \hbar^2 k^2/(2m)$, then we would obtain the density of states $g(\epsilon) \propto \epsilon^2$
instead of $g(\epsilon) \propto \epsilon^{1/2}$ and Fermi energy $\epsilon_F \propto n^{1/3}$ rather than $\epsilon_F \propto n^{2/3}$. The resistance
to gravitational collapse would now depend on particle density according to $p_s \propto n_s^{4/3}$
rather than $p_s \propto n_s^{5/3}$, and on stellar radius according to $p_s \propto R^{-4}$ instead of $p_s \propto R^{-5}$.
The degenerate electron gas pressure will increase more slowly as the star is crushed
when the particles are relativistic in energy, and in fact it increases at the same rate as
the gravitational burden $p_G \propto R^{-4}$. The softening of the electron gas is illustrated in
Figure 12.8. This is a crucial point: there is then no stable radius at which the opposing
pressures will balance.

We conclude that any star with $M_s > M_{ch} \approx 1.44 M_\odot$, where $M_\odot$ is the mass of the
sun, would have to look elsewhere for support against gravitational collapse after its
nuclear fuel runs out. $M_{ch}$ is known as the Chandrasekhar limit, after Subrahmanyan
Chandrasekhar (1910-1995). But there are other possibilities to fall back on. An alternative
support mechanism is considered in the next section, and with it a new class of
stellar remnant.


12.8 Neutron Stars 

If the star collapses so far that the electron gas has turned relativistic and soft, there is
still a possibility that a balance can be established between gravitational pressure and
the degeneracy pressure of the nuclei inside the star. These are mostly protons, which
are also fermions, and as they are 1800 times more massive than electrons, they remain
nonrelativistic at stellar densities beyond that to be found at the Chandrasekhar limit.

We might imagine that a balance between forces is achieved at a stellar radius

$$R_n = \frac{2}{5} \left(\frac{9\pi}{4}\right)^{2/3} \frac{\hbar^2}{G M_s^{1/3} m_p^{8/3}}, \qquad (12.34)$$

which is just the equation for the dwarf radius $R_d$ in (12.31) with the electron mass
replaced by the proton mass. This is illustrated in Figure 12.8. The radius is 1800 times
smaller than the dwarf radius we estimated earlier, or around 4 km. Remember that this
contains $10^{30}$ kg of material! The mass density is ten orders of magnitude greater than
that of the already crushed white dwarf, or about $10^{19}$ kg m$^{-3}$, and the particle density
is $10^{47}$ m$^{-3}$ such that the mean particle separation is of order 0.1 fm. An estimate of the
pressure would be $10^{35}$ Pa! The entropy, on the other hand, would be essentially zero.
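The scaling between the white dwarf and neutron star radii follows from replacing the mass of the degenerate fermion in the same formula. The sketch below does exactly that; the function name, the choice of a stellar mass of $2 \times 10^{30}$ kg and the assumption of one fermion per proton mass of material are all part of the illustration rather than a refined model.

```python
import math

G = 6.6743e-11           # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
M_P = 1.67262192e-27     # proton mass, kg


def degenerate_star_radius(mass, fermion_mass):
    """Radius at which nonrelativistic degeneracy pressure balances gravity.

    Follows the form of (12.31), with the mass of the degenerate fermion left
    as a parameter and one such fermion per proton mass of stellar material.
    """
    prefactor = 0.4 * (9.0 * math.pi / 4.0) ** (2.0 / 3.0)
    return prefactor * HBAR**2 / (G * mass ** (1.0 / 3.0) * M_P ** (5.0 / 3.0) * fermion_mass)


mass = 2.0e30  # kg
r_dwarf = degenerate_star_radius(mass, M_E)
r_neutron = degenerate_star_radius(mass, M_P)
density = mass / (4.0 / 3.0 * math.pi * r_neutron**3)

print(r_dwarf / r_neutron)  # equals m_p / m_e, about 1800
print(r_neutron, density)
```

The radius comes out at a few kilometres and the mass density at a few times $10^{18}$ kg m$^{-3}$, consistent in order of magnitude with the figures quoted above.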

At such an enormous pressure, it becomes favourable for protons to combine with 
electrons to become neutrons, which are also fermions, and emit a neutrino. The star is 
converted into a gravitationally bound gas of neutrons at a density similar to that of the 
atomic nucleus. Such a stellar remnant is seriously astonishing. But is there evidence that 
they exist? Extraordinarily, the answer is yes: pulsars, extremely compact astronomical 
objects that emit pulses of radio emissions, are thought to be examples of these so-called 
neutron stars. Possessing a mass beyond the Chandrasekhar limit, these objects can be 
stabilised only by the degeneracy pressure of neutrons. 

But immediately we can see that there is an upper limit to the mass of a neutron 
star. If the neutron gas goes relativistic, it will in turn become too soft to carry the 
burden of the gravitational pressure. What happens then is not clear. Ideally, the star 
would look for a similar way out, and try to form a fermion gas made up of still 
heavier particles from the stupendously dense nuclear material. We simply do not know 
what could happen, but we have an idea of another fate that might befall the star if its 
collapse continues too far. It might disappear behind its event horizon; or more exactly 
its radius R might fall below the Schwarzschild radius $R_s = 2GM_s/c^2$, which is about
1 km for the parameters considered. After this point gravity at the surface of the stellar 
remnant becomes so strong that even light is trapped, and the elderly star turns into a 
black hole. 
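The Schwarzschild radius is a one-line calculation. In this sketch the function name is an assumption, and the mass of $10^{30}$ kg is the figure used for the stellar remnant earlier in the section.

```python
C = 2.99792458e8   # speed of light, m/s
G = 6.6743e-11     # gravitational constant, m^3 kg^-1 s^-2


def schwarzschild_radius(mass):
    """Schwarzschild radius 2 G M / c^2 in metres."""
    return 2.0 * G * mass / C**2


r_s = schwarzschild_radius(1.0e30)
print(r_s)  # about 1.5e3 m
```

Since this is only a few times smaller than the neutron star radius estimated from (12.34), a modest further collapse is enough to take the remnant inside its event horizon.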


12.9 Entropy of a Black Hole 


This is not the place to discuss black holes in detail, as it would require the mathematics 
of general relativity, but their thermodynamics are very interesting. Black holes possess 
entropy, and a great deal of it. 

Various studies have suggested that the entropy of a black hole can be written as

$$S_{BH} = \frac{kc^3 A}{4G\hbar}, \qquad (12.35)$$

where $A = 4\pi R_s^2$ is the area of the event horizon of the black hole, defined as the surface
of a sphere at the Schwarzschild radius. The label BH can stand for either black hole
or for the surnames of Jacob Bekenstein (1947-) and Stephen Hawking (1942-), the
main developers of the ideas. The entropy per neutron (presuming that this is the right
particle to use) of the hole is $S_{BH}/(M_s/m_p) = 4\pi kGM_s m_p/(c\hbar) \approx 10^{20}\, k$. In common
with many other aspects of the physics of stellar remnants, this is a huge number, but it
is also seriously confusing.
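The entropy per particle can be evaluated from (12.35) in two equivalent ways, which provides a check on the algebra. The function names are assumptions of this sketch, the entropy is expressed in units of k, and the mass of $10^{30}$ kg is the figure used for the remnant above.

```python
import math

C = 2.99792458e8         # speed of light, m/s
G = 6.6743e-11           # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_P = 1.67262192e-27     # proton mass, used here for the nucleon mass, kg


def black_hole_entropy_over_k(mass):
    """Bekenstein-Hawking entropy c^3 A / (4 G hbar) in units of k, from (12.35)."""
    r_s = 2.0 * G * mass / C**2
    area = 4.0 * math.pi * r_s**2
    return C**3 * area / (4.0 * G * HBAR)


def entropy_per_nucleon_over_k(mass):
    """Entropy divided by the number of nucleons M / m_p, in units of k."""
    return black_hole_entropy_over_k(mass) / (mass / M_P)


def entropy_per_nucleon_closed_form(mass):
    """Equivalent closed form 4 pi G M m_p / (c hbar), in units of k."""
    return 4.0 * math.pi * G * mass * M_P / (C * HBAR)


mass = 1.0e30  # kg
print(entropy_per_nucleon_over_k(mass), entropy_per_nucleon_closed_form(mass))
```

Both expressions give a few times $10^{19}$ in units of k, in line with the order of magnitude quoted. Note that the entropy per particle grows in proportion to the mass, since the total entropy scales as $M_s^2$.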

One puzzle arises from the so-called no-hair theorem of black holes, which suggests
that a black hole has only a handful of measurable properties, such as mass, angular
momentum and charge. How can such an apparently simple physical object possess so
much internal uncertainty? The answer is that these gross properties are the macroscopic
state variables; all the microscopic detail that was uncertain but in principle measurable
just before the star disappeared behind its event horizon is still there, but is now
definitively hidden from view.

A more serious puzzle is the following. The neutron material just before the disappearance
of the remnant behind the event horizon was an extremely degenerate fermion
gas, and we might have been under the impression that its entropy per particle was
approximately zero. Where has the black hole entropy come from?

This is where the topic becomes speculative. The Bekenstein-Hawking entropy, if 
it is indeed a fundamental property of the object behind the event horizon, must be a 
fine grain entropy related to the number of microstates of the components of whatever 
the object is made of. We discussed coarse and fine grain entropy in Section 8.4: at 
the coarse grain level of the neutron gas the entropy is zero. The significance of the 
black hole entropy is that it might be giving us a count of arrangements at the very 
lowest level of the universe; way below atoms and quarks. Somehow these features are 
implicit in having used general relativity to study black hole behaviour and deduce its 
thermodynamic properties. Models of black holes consistent with general relativity but 
based on current variants of string theory appear to offer such a vast multiplicity of 
microstates, with their associated entropy. It could be that the thermodynamics of black 
holes provides a perspective on the ultimate level of physical reality, although it is a 
little too soon to be absolutely sure! 

Exercises 

12.1 Three fermions, all with the same spin orientation, are held in a harmonic potential
that, with the neglect of zero point energy, has single particle states at energies
$\epsilon_n = n\hbar\omega$, where n = 0, 1, 2 and so on. Indicate the particle arrangement and
energy of the four lowest energy microstates of the system if the particles are (a)
distinguishable and (b) indistinguishable. Assuming the canonical partition functions
for each case may be approximated at low temperature as a sum over these
four microstates, calculate the Helmholtz free energies of each system in terms of
$x = \exp(-\hbar\omega/kT)$. Also calculate the mean energies.

12.2 Show that the chemical potential of a gas of nonrelativistic electrons at a temperature
much less than the Fermi temperature is proportional to $n^{2/3}$, where n is the
particle density.

12.3 For a degenerate gas of ultrarelativistic electrons in volume V, the density of states
g(E) is proportional to $VE^2$. Show that the Fermi energy is proportional to $n^{1/3}$.
Hence show that $p \propto n^{4/3}$ for an ultrarelativistic degenerate electron gas.

12.4 Derive the Fermi energy $\epsilon_F$ of a gas of N electrons confined to a volume V at
zero temperature and show that the mean energy per electron in such conditions
is $3\epsilon_F/5$. The relationship between the mean energy E and the entropy S of a low
temperature electron gas is

$$E = \frac{3}{5} N\epsilon_F + \frac{\epsilon_F S^2}{\pi^2 k^2 N}.$$

Show that the relationship between temperature and entropy for this system is given
by $T = 2\epsilon_F S/(\pi^2 k^2 N)$. Does the gas satisfy the third law of thermodynamics?
Express E and S in terms of T to show that the Helmholtz free energy is given by

$$F = \frac{3}{5} N\epsilon_F - \frac{N(\pi kT)^2}{4\epsilon_F}.$$

Hence show that the pressure of the gas is

$$p = \frac{2NkT_F}{5V} \left[1 + \frac{5\pi^2}{12} \left(\frac{T}{T_F}\right)^2\right],$$

where $T_F$ is the Fermi temperature. In a particular metal, the Fermi temperature is
57000 K. Calculate the entropy per electron at T = 300 K.

12.5 A system consists of two levels with energies 0 and $\epsilon$, respectively. Calculate the
canonical partition function $Z_1$ of the system when it accommodates one particle
and is in contact with a heat bath at temperature T. Express your result in
terms of the parameter $y = \exp(-\epsilon/kT)$. Calculate the mean energy of the particle.
Calculate the canonical partition function $Z_2$ when the system accommodates
two indistinguishable fermions, if each level can hold one particle at most. The
system is exposed to an environment that is a source of indistinguishable fermions
at a chemical potential $\mu$, and heat at a temperature T. Determine the mean number
of particles in the system $\langle N \rangle$, in terms of y and $w = \exp(\mu/kT)$.

12.6 An astronomical object is discovered with an apparent radius of 2.5 km. Assuming it
is a stabilised stellar remnant with a mass equal to that of the sun, deduce the approximate
mass of the fermions within it, and their likely nature, assuming there is only one
species present.


Photon Gas


In this chapter we shall consider the statistical thermodynamics of light. We consider 
the dynamics of electric and magnetic fields inside a box, and then couple their standing 
wave modes to a heat bath such that they acquire a canonical equilibrium distribution 
of energy. Then thermodynamic properties such as energy density, pressure and entropy 
are derived, and used to explain various aspects of black-body or cavity radiation. It was 
the resolution of some of these puzzles, particularly by Max Planck (1858-1947), that 
set off the quantum revolution at the beginning of the twentieth century. 

We shall find that electromagnetic radiation behaves rather like a gas of particles, 
involving quanta of the electromagnetic field known as photons. They have zero mass, 
move at the speed of light and do not interact with each other. On the other hand, they 
interact strongly with electrically charged matter, to a degree that the walls of a container 
can be considered to act as a particle reservoir for photons, but at a rather unexpected 
chemical potential. 


13.1 Electromagnetic Waves in a Box 


We begin this ambitious programme by establishing the classical standing wave modes 
of electric and magnetic fields inside an evacuated box. The wave equation controls their 
evolution, for example 


$$ \frac{\partial^2 \mathbf{E}}{\partial t^2} = c^2 \nabla^2 \mathbf{E}, \tag{13.1} $$


where $\mathbf{E}$ is the electric field and $c$ is the speed of light. Assuming that the box is a
cube of side length $l$ and the walls are good conductors, we impose boundary conditions
such as $E_y(x = 0, t) = E_y(x = l, t) = 0$ together with a similar constraint on the
$z$-component. This means we expect a standing wave mode along the $x$-axis with transverse
polarisation in the $y$ direction given by $E_y(x,t) = E_{y0} \sin(k_x x) \sin(\omega t)$ where $E_{y0}$ is
the maximum electric field amplitude, $k_x = n_x \pi/l$ with $n_x$ a non-negative integer, and
$\omega = c k_x$. There are standing wave modes in the other two Cartesian directions, involving
the respective components of wavevector $k_y$ and $k_z$, and each has two directions of
transverse polarisation.
This specification of the independent oscillatory modes of the electric field is very
reminiscent of the wavefunction solutions to the Schrödinger equation describing a single
particle in a box, considered in Section 9.1. It is important to bear in mind that
these modes are classical field oscillations, and not wavefunctions, but nevertheless we
shall carry over many of the concepts. Using an argument similar to that employed in
Section 9.2, we can deduce that the number of standing wave modes with a magnitude
of wavevector in the range $k \to k + dk$ is

$$ \rho(k) dk = 2 \frac{V k^2}{2\pi^2} dk, \tag{13.2} $$

in terms of a density of states $\rho(k)$. Notice that instead of the factor of $(2s + 1)$ in (9.9)
that denotes the number of spin orientations of a spin $s$ particle, we have a factor of two
that counts the number of transverse polarisations associated with a standing wave with
a given wavevector.
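The mode counting behind (13.2) can be checked by brute force. The following is a minimal sketch rather than part of the original argument: it assumes that the three mode numbers are strictly positive integers, that every wavevector carries two polarisations, and it uses a hypothetical cut-off `radius` measured in units of $\pi/l$. Integrating (13.2) from 0 to $K = \pi \times$ `radius` $/l$ gives $VK^3/(3\pi^2)$, which is compared with the direct count.

```python
import math


def count_modes(radius: int) -> int:
    """Count standing wave modes with n_x^2 + n_y^2 + n_z^2 <= radius^2.

    Assumption: all three mode numbers are strictly positive integers,
    and every wavevector carries two transverse polarisations.
    """
    limit = radius * radius
    points = 0
    for nx in range(1, radius + 1):
        for ny in range(1, radius + 1):
            for nz in range(1, radius + 1):
                if nx * nx + ny * ny + nz * nz <= limit:
                    points += 1
    return 2 * points


def modes_from_density(radius: int) -> float:
    """Integrate rho(k) = V k^2 / pi^2 from 0 to K = pi * radius / l.

    The result V K^3 / (3 pi^2) reduces to pi * radius^3 / 3 because V = l^3.
    """
    return math.pi * radius**3 / 3.0


radius = 60  # hypothetical cut-off, in units of pi / l
ratio = count_modes(radius) / modes_from_density(radius)
print(f"direct count / continuum estimate = {ratio:.3f}")
```

The ratio falls slightly below one because lattice points on the coordinate planes are excluded; the shortfall shrinks as the cut-off grows, which is why the continuum density of states is adequate for a macroscopic box.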

The corresponding magnetic field must satisfy Maxwell’s equations of electromagnetism,
particularly

$$ \frac{\partial \mathbf{B}}{\partial t} = -\nabla \times \mathbf{E}, \tag{13.3} $$

and so $\partial B_z/\partial t = -\partial E_y/\partial x = -k_x E_{y0} \cos(k_x x) \sin(\omega t)$ giving
$B_z(x,t) = B_{z0} \cos(k_x x) \cos(\omega t)$ with $B_{z0} = k_x E_{y0}/\omega = E_{y0}/c$, with similar
specifications of a magnetic mode to accompany each of the other five electric field modes.
The pattern of the fields is illustrated in Figure 13.1.

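It is worth making explicit the correspondence between this behaviour and that of a
mechanical oscillator. The energy $E$ of the specified field mode, not to be confused with
the amplitude of electric field $E_y$, is given by

$$ E = \int \left( \frac{\epsilon_0}{2} E_y^2 + \frac{1}{2\mu_0} B_z^2 \right) dV = \int \left( \frac{\epsilon_0}{2} E_{y0}^2 \sin^2 k_x x \, \sin^2 \omega t + \frac{1}{2\mu_0} B_{z0}^2 \cos^2 k_x x \, \cos^2 \omega t \right) dV = \frac{\epsilon_0 V}{4} E_{y0}^2, \tag{13.4} $$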



Figure 13.1 The green waveform represents a standing wave mode of the electric field in a 
cavity, and the yellow represents the magnetic field. Nodes of the electric field lie at the cavity 
walls. The electric and magnetic modes oscillate $\pi/2$ out of phase. There are similar modes
extending in the y and z directions. 
using $c^2 = 1/(\epsilon_0 \mu_0)$, where $\epsilon_0$ and $\mu_0$ are the permittivity and permeability of
vacuum. A mechanical oscillator with mass $m$ and harmonic spring constant $\kappa$ evolves
according to $dp/dt = -\kappa q$ and $dq/dt = p/m$, where $q$ is the displacement and $p$ the
momentum, such that $p = p_0 \cos \omega t$ and $q = q_0 \sin \omega t$ where $\omega = (\kappa/m)^{1/2}$, and with
amplitudes related by $p_0 = q_0 \kappa/\omega$, giving an energy $E = (1/2)\kappa q^2 + (1/2m)p^2 =
(1/2)\kappa q_0^2 \sin^2 \omega t + (1/2m) p_0^2 \cos^2 \omega t = \kappa q_0^2/2$, which is similar in form to (13.4). As a
result of this correspondence, we conclude that the electric and magnetic fields roughly
correspond to the displacement and momentum of the oscillator.

This strongly suggests that the standing wave mode should be treated in quantum
mechanics in just the same way as a mechanical oscillator. Its energy cannot take arbitrary
values, but only the quantised values $(n + 1/2)\hbar\omega$, where $n$ is a non-negative
integer. The quantised electromagnetic field in the box is physically equivalent to a set
of harmonic oscillators with specified frequencies. We can therefore use what we know
about the statistical thermodynamics of oscillators to establish the thermal properties of
electromagnetic radiation.

13.2 Partition Function of the Electromagnetic Field 

We now construct the canonical partition function of the electromagnetic field. We have
a set of oscillatory modes defined by a wavevector $k$ and angular frequency $\omega$. The
density of states in $k$ allows us to obtain the corresponding density of states in $\omega$ using
the dispersion relation $\omega = ck$. We write

$$ g(\omega) d\omega = \rho(k) dk = \frac{V k^2}{\pi^2} dk = \frac{V \omega^2}{\pi^2 c^3} d\omega. \tag{13.5} $$


Each standing wave mode of the electromagnetic field is in canonical equilibrium with
a heat bath at temperature $T$. The analysis in Section 6.4 then gives us the canonical
partition function of the mode:

$$ Z_\omega = \frac{1}{2 \sinh(\hbar\omega/2kT)}. \tag{13.6} $$


The total partition function $Z$ is the product of $Z_\omega$ over all modes, and so

$$ \ln Z = \int_0^\infty g(\omega) \ln Z_\omega \, d\omega, \tag{13.7} $$


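since $g(\omega) d\omega$ counts the number of modes in the frequency range $\omega$ to $\omega + d\omega$. Inserting
(11.30), we get

$$ \ln Z = -\int_0^\infty g(\omega) \ln \left[ 2 \sinh(\hbar\omega\beta/2) \right] d\omega, \tag{13.8} $$

with $\beta = 1/kT$,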
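and the mean energy of the system is

$$ \langle E \rangle = -\frac{\partial \ln Z}{\partial \beta} = -\int_0^\infty g(\omega) \frac{\partial \ln Z_\omega}{\partial \beta} d\omega = \int_0^\infty \hbar\omega \left( \langle N \rangle_\omega + \frac{1}{2} \right) g(\omega) d\omega = \frac{\hbar V}{c^3 \pi^2} \int_0^\infty \frac{\omega^3 d\omega}{2 \tanh(\hbar\omega\beta/2)}, \tag{13.9} $$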


where we have inserted the identity $\coth(x/2) = 2(\exp(x) - 1)^{-1} + 1$ and defined

$$ \langle N \rangle_\omega = \frac{1}{\exp(\hbar\omega\beta) - 1}. \tag{13.10} $$


But it should be noticed that there is a serious problem with the mean energy in (13.9): it
is infinite! The trouble comes from the term that is proportional to $\int_0^\infty \omega^3 d\omega$. We cannot
work with this.

We can readily identify the source of the problem and find a way round. The expression 
for the mean energy in (13.9) clearly shows that the infinite term is the sum of zero 
point energies of the quantum oscillators. The problem can therefore be overcome if 
we set a limit on the number of oscillators that represent the fields in the box. In 
addition, this is physically sensible: it is unreasonable to suggest that field modes should 
exist for arbitrarily high frequency and wavevector, or equivalently for arbitrarily small 
wavelengths. We do not know whether Maxwell’s equations apply for length scales that 
are arbitrarily small. 

So we cut off the integral at some high but finite frequency, such that the mean energy 
is no longer infinite. Furthermore, we take the view that the large but constant zero point 
energy of the field is not of interest to us in statistical thermodynamics, and simply 
renormalise the energy scale by subtracting it away. We shall therefore work with 


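$$ \langle E \rangle = \int_0^\infty \hbar\omega \langle N \rangle_\omega g(\omega) d\omega = \frac{\hbar V}{c^3 \pi^2} \int_0^\infty \frac{\omega^3}{\exp(\hbar\omega\beta) - 1} d\omega, \tag{13.11} $$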


or equivalently with

$$ \ln Z = \int_0^\infty g(\omega) \ln Z^*_\omega \, d\omega = -\int_0^\infty g(\omega) \ln \left[ 1 - \exp(-\hbar\omega\beta) \right] d\omega, \tag{13.12} $$

which is based on the partition function $Z^*_\omega$ of an oscillator for which the zero
point energy has been ignored. Its energy levels are given by $n\hbar\omega$, and $Z^*_\omega =
\sum_{n=0}^\infty \exp(-n\hbar\omega\beta) = (1 - \exp(-\hbar\omega\beta))^{-1}$, which is to be contrasted with the form
derived in (6.23) and used in (13.8). These quantities satisfy $\langle E \rangle = -\partial \ln Z/\partial \beta$ as they
should.
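The consistency of (13.11) and (13.12) can be checked for a single mode, since both integrals weight the mode contributions by the same $g(\omega)$. The sketch below is an illustration under assumed, dimensionless values of $\beta$ and $\hbar\omega$ (they are not taken from the text): it differentiates $\ln Z^*_\omega$ numerically and compares the result with $\hbar\omega \langle N \rangle_\omega$.

```python
import math


def ln_z_star(beta: float, hbar_omega: float) -> float:
    """ln Z* for one mode with the zero point energy dropped."""
    return -math.log(1.0 - math.exp(-beta * hbar_omega))


def mean_mode_energy(beta: float, hbar_omega: float) -> float:
    """hbar*omega times the mean number of quanta in the mode."""
    return hbar_omega / math.expm1(beta * hbar_omega)


# Hypothetical values in arbitrary units, chosen only for illustration.
beta, hbar_omega, step = 1.3, 0.7, 1e-6

# Central finite difference for d(ln Z*)/d(beta).
derivative = (
    ln_z_star(beta + step, hbar_omega) - ln_z_star(beta - step, hbar_omega)
) / (2.0 * step)

energy_from_z = -derivative
energy_direct = mean_mode_energy(beta, hbar_omega)
print(energy_from_z, energy_direct)
```

The two printed numbers should agree to within the accuracy of the finite difference, which is the single-mode content of the statement that $\langle E \rangle = -\partial \ln Z/\partial \beta$.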

The quantity $\langle N \rangle_\omega$ is the mean number of quanta in the mode at frequency $\omega$. Now we
arrive at a point where we place a physical interpretation on the model, and in particular
on (13.11). The standing wave modes provide an analogue of the single particle states
of particles in a box discussed extensively in the last two chapters. The mean energy
of the field (13.11) is similar in form to the mean energies of boson and fermion gases
in (11.14) and (12.7). It seems that the electromagnetic fields in the box, coupled to a
heat bath, behave rather like a gas of particles that can occupy single particle states at
energies $\hbar\omega$, with the density of mode frequencies $g(\omega)$ playing the role of the density of
single particle states. The number of quanta in each mode is the analogue of the number
of particles in the state, and we can define

$$ \langle N \rangle = \int_0^\infty \langle N \rangle_\omega g(\omega) d\omega = \int_0^\infty \frac{g(\omega)}{\exp(\hbar\omega\beta) - 1} d\omega, \tag{13.13} $$

to be the total number of these particles in the box.

Furthermore, the dispersion relation $\omega = ck$ implies that these particles satisfy a linear
relationship between energy and wavevector $\epsilon = c\hbar k$, suggesting that their rest mass $m_0$
is zero, since the general relationship is $\epsilon^2 = m_0^2 c^4 + p^2 c^2$ and the momentum of a freely
propagating particle is $p = \hbar k$. A rest mass of zero implies that the particles travel at the
speed of light, and the negative sign in the denominator of $\langle N \rangle_\omega$ suggests that they behave
rather like bosons, with integer spin. These particles are the photons that were mentioned
at the beginning of this chapter. This interpretation is an extremely important conceptual 
leap. We explore the idea of photons, and determine thermodynamic properties of the 
electromagnetic field, in the next section. 


13.3 Thermal Properties of a Photon Gas 

13.3.1 Planck Energy Spectrum of Black-Body Radiation 

The properties of electromagnetic radiation inside cavities, determined by experiment in 
the later decades of the nineteenth century, provided the first indications that classical 
physics could not account for the workings of the world. Planck introduced the quantum 
view by suggesting that the energies of cavity fields were quantised, and obtained (13.11), 
which we write as 

$$ \langle E \rangle = V \int_0^\infty u(\omega) d\omega, \tag{13.14} $$

in terms of $u(\omega)$, an energy density in frequency, per unit spatial volume.

This density could be determined by measuring a frequency spectrum of the intensity 
of radiation emerging from a small hole in a cavity, and regarding it as a sample of the 
equilibrium counter-propagating radiation to be found inside. The walls of the cavity are 
assumed to emit such radiation to balance the absorption of radiation from an opposing 
surface. The walls are regarded as the reservoir that supplies the modes with energy, 
through coupling of electromagnetic fields with charged matter. For ideal coupling, there 
is no reflection, such that all the incident radiation is absorbed before being re-emitted. 
Roughly speaking, absorbing walls are black, and the emission from them is called 
black-body radiation. It is a slight misnomer though, as a hot wall can emit very bright 
radiation, and not appear black in the least! 

The point made by Planck was that the inferred experimental energy spectrum $u(\omega)$
matched the form suggested by (13.11), which we now know as the Planck spectrum

$$ u(\omega) = \frac{\hbar \omega^3}{c^3 \pi^2 \left( \exp\left( \frac{\hbar\omega}{kT} \right) - 1 \right)}, \tag{13.15} $$
Figure 13.2 Sketch of the Planck energy spectrum of black-body radiation at a low temperature
$T_c$ (blue line) and a high temperature $T_h$ (red line). The frequency $\omega_{\max}$ where the spectrum
reaches a peak increases linearly with temperature (Wien’s law) and the area under the curve,
illustrated for the case $T = T_c$, is the total radiant energy and increases with the fourth power
of the temperature (Stefan’s law).


where the temperature of black-body radiation is that of the walls with which it couples. 
Examples at a high and low temperature are shown in Figure 13.2. The spectrum has a 
peak that corresponds to the condition 


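$$ \frac{du}{d\omega} = \frac{\hbar}{c^3 \pi^2} \left( \frac{3\omega^2}{\exp\left( \frac{\hbar\omega}{kT} \right) - 1} - \frac{\omega^3 \exp\left( \frac{\hbar\omega}{kT} \right)}{\left( \exp\left( \frac{\hbar\omega}{kT} \right) - 1 \right)^2} \frac{\hbar}{kT} \right) = 0, \tag{13.16} $$

and hence the peak frequency $\omega_{\max}$ is given by $3(e^x - 1) = x e^x$, where $x = \hbar\omega_{\max}/kT$.
The numerical solution is

$$ \omega_{\max} \approx 2.821 \frac{kT}{\hbar}, \tag{13.17} $$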

and the peak frequency is therefore proportional to temperature, with the proportionality 
constant providing a value for the Planck constant. The experimental observation that 
the peak in the energy spectrum of black-body radiation scales linearly with temperature 
is known as Wien’s law. 
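The numerical factor in (13.17) can be recovered with a few lines of code. This is a minimal sketch using plain bisection; the bracketing interval $[2, 3]$ is an assumption chosen to exclude the trivial root at $x = 0$, and the function names are illustrative only.

```python
import math


def wien_condition(x: float) -> float:
    """Left minus right side of 3(e^x - 1) = x e^x."""
    return 3.0 * math.expm1(x) - x * math.exp(x)


def bisect(func, low: float, high: float, tolerance: float = 1e-12) -> float:
    """Plain bisection; assumes func changes sign between low and high."""
    f_low = func(low)
    while high - low > tolerance:
        mid = 0.5 * (low + high)
        f_mid = func(mid)
        if f_low * f_mid <= 0.0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return 0.5 * (low + high)


# The interval [2, 3] brackets the non-trivial root (assumed from a sign check).
x_peak = bisect(wien_condition, 2.0, 3.0)
print(f"x at the peak of the Planck spectrum: {x_peak:.4f}")
```

Multiplying the root by $kT/\hbar$ gives the peak angular frequency at any temperature, which is the linear scaling expressed by Wien’s law.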

The Planck spectrum can be interpreted as a density of photons in frequency, per unit 
volume, with the form 


$$ n(\omega) = \frac{1}{V} \langle N \rangle_\omega g(\omega) = \frac{\omega^2}{c^3 \pi^2 \left( \exp\left( \frac{\hbar\omega}{kT} \right) - 1 \right)}. \tag{13.18} $$

The mean population of photons in a mode $\langle N \rangle_\omega = [\exp(\hbar\omega/kT) - 1]^{-1}$ has a form
so similar to the Bose-Einstein statistics expression (11.3), when we regard $\hbar\omega$ as the
analogue of the energy of a single particle state, that we can deduce two important
properties of photons, namely that they are bosons, as already noted, and that they have
zero chemical potential.

This might sound peculiar, but the fact is that photons differ from particles of matter 
in that their total population does not satisfy a conservation condition. There are no 
photons in the environment coupled to our system and their population in the box can
be increased simply by increasing the temperature of the walls. The chemical potential 
was developed in our discussion of the grand canonical ensemble in Chapter 7 on the 
basis of particle conservation under exchange between the system and its environment. 
Photons do not fully correspond to the classical concept of particles, and one implication 
of this is that black-body radiation has zero chemical potential. 


13.3.2 Photon Energy Density and Flux 


The Planck spectrum allows us to calculate the total energy of black-body radiation per
unit volume $e = \langle E \rangle/V$. This is

$$ e = \int_0^\infty u(\omega) d\omega = \frac{\hbar}{c^3 \pi^2} \int_0^\infty \frac{\omega^3 d\omega}{\exp\left( \frac{\hbar\omega}{kT} \right) - 1} = \frac{(kT)^4}{c^3 \pi^2 \hbar^3} \int_0^\infty \frac{x^3 dx}{e^x - 1}, \tag{13.19} $$

having inserted $x = \hbar\omega/kT$. The integral can be shown to be equal to $\pi^4/15$, and so the
energy density is

$$ e = \frac{4\sigma}{c} T^4, \tag{13.20} $$

where $\sigma = \pi^2 k^4/(60 c^2 \hbar^3) = 5.67 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$ is known as the Stefan-Boltzmann
constant.
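Both numbers quoted above are easy to verify. The sketch below evaluates the dimensionless integral in (13.19) with a simple midpoint rule and computes $\sigma$ from its definition; the values of $k$, $\hbar$ and $c$ are the standard SI values, and the integration cut-off and step count are assumptions made for this illustration.

```python
import math

BOLTZMANN = 1.380649e-23    # k, in J/K
HBAR = 1.054571817e-34      # reduced Planck constant, in J s
LIGHT_SPEED = 2.99792458e8  # c, in m/s


def planck_integral(upper: float = 60.0, steps: int = 60000) -> float:
    """Midpoint-rule estimate of the integral of x^3 / (e^x - 1) from 0 to upper.

    The integrand decays like x^3 e^(-x), so the tail beyond the assumed
    cut-off of 60 is negligible.
    """
    width = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        total += x**3 / math.expm1(x)
    return total * width


integral = planck_integral()
sigma = math.pi**2 * BOLTZMANN**4 / (60.0 * LIGHT_SPEED**2 * HBAR**3)

print(f"integral = {integral:.6f}, pi^4/15 = {math.pi**4 / 15.0:.6f}")
print(f"sigma = {sigma:.4e} W m^-2 K^-4")
```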

For a gas, the flux of particles through a plane is known from kinetic theory to
be $\phi = n\bar{v}/4$, where $\bar{v}$ is the mean speed and $n$ is the particle density. In the photon
picture, the mean speed of the particles is $c$, and so the flux of photons with frequencies
in the range $\omega \to \omega + d\omega$ is $c n(\omega) d\omega/4$. The energy flux associated with this cohort is
$\hbar\omega c n(\omega) d\omega/4$ and therefore that associated with all frequencies is

$$ \phi_E = \frac{c}{4} \int_0^\infty \hbar\omega n(\omega) d\omega = \frac{c}{4} \int_0^\infty u(\omega) d\omega = \frac{c}{4} e = \sigma T^4. \tag{13.21} $$

Black bodies therefore radiate energy according to the fourth power of their temperature,
matching the experimental observation known as Stefan’s law. The increase in area under
the Planck spectrum $u(\omega)$ due to an increase in temperature is illustrated in Figure 13.2.


13.3.3 Photon Pressure 


Black-body radiation exerts a pressure, further supporting an interpretation in terms of 
a photon gas. It is also known as radiation pressure. We employ (8.22): 


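$$ p = kT \left( \frac{\partial \ln Z}{\partial V} \right)_T \tag{13.22} $$

and (13.12), so that

$$ p = -kT \int_0^\infty \frac{\partial g(\omega)}{\partial V} \ln \left[ 1 - \exp\left( -\frac{\hbar\omega}{kT} \right) \right] d\omega = -\frac{kT}{\pi^2 c^3} \int_0^\infty \omega^2 \ln \left[ 1 - \exp\left( -\frac{\hbar\omega}{kT} \right) \right] d\omega. \tag{13.23} $$

We integrate by parts and refer to (13.19) to obtain

$$ p = \frac{1}{3} \frac{\hbar}{\pi^2 c^3} \int_0^\infty \frac{\omega^3 d\omega}{\exp\left( \frac{\hbar\omega}{kT} \right) - 1} = \frac{1}{3} e = \frac{4\sigma}{3c} T^4. \tag{13.24} $$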
This expression may be combined with Stefan’s law (13.20) such that the pressure
and energy flux are related by $p = 4\phi_E/(3c)$. Apart from a difference in numerical
factor, this is compatible with the idea that when radiation is reflected from a wall, the
pressure exerted is proportional to the momentum flux carried by the photons, and this
is proportional to the energy flux since the momentum of a photon with wavevector $k$ is
$\hbar k = \hbar\omega/c$, while its energy is $\hbar\omega$.

Photon pressure acts on the walls of a cavity containing the radiation, or if it escapes
from the cavity, on any surface upon which the radiation falls, in proportion to the
incident intensity. Notice the contrast between $p = e/3$ for radiation and the relation
$p = 2e/3$ found for the gas of nonrelativistic particles, derived, for example, in (2.5) and
later in (12.11). The effect is typically very small: for example, if the walls of a cavity are
at $10^3$ K, the radiation pressure inside is only about $10^{-4}$ Pa, or one billionth of an
atmosphere. But even the pressure of radiation from the sun can have an effect on the
trajectories of spacecraft over long distances, and can even be exploited for propulsion,
at least in principle.
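The order of magnitude quoted for the radiation pressure follows directly from $p = 4\sigma T^4/(3c)$. The short sketch below is an illustration only; the value used for one standard atmosphere (101325 Pa) is an assumption brought in for the comparison and does not appear in the text.

```python
SIGMA = 5.67e-8             # Stefan-Boltzmann constant, W m^-2 K^-4
LIGHT_SPEED = 2.99792458e8  # c, in m/s
ATMOSPHERE = 101325.0       # one standard atmosphere in Pa (assumed reference)


def radiation_pressure(temperature: float) -> float:
    """Black-body radiation pressure p = 4 sigma T^4 / (3 c), in Pa."""
    return 4.0 * SIGMA * temperature**4 / (3.0 * LIGHT_SPEED)


pressure = radiation_pressure(1.0e3)
fraction_of_atmosphere = pressure / ATMOSPHERE
print(f"p = {pressure:.2e} Pa, or {fraction_of_atmosphere:.1e} atm")
```

Because the pressure scales as $T^4$, raising the wall temperature by a factor of ten increases the pressure by four orders of magnitude.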

13.3.4 Photon Entropy 

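Finally, we consider the entropy per unit volume of a photon gas. Using the Gibbs
formulation from (8.11), we write

$$ S_G = k \ln Z + \frac{\langle E \rangle}{T} = \frac{pV}{T} + \frac{\langle E \rangle}{T} = \frac{4 \langle E \rangle}{3T} = \frac{16\sigma}{3c} V T^3, \tag{13.25} $$

so that the entropy per unit volume is $s = 4e/(3T)$.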
Notice that the Gibbs free energy of the photon gas is zero:

$$ G = \langle E \rangle - T S_G + pV = \langle E \rangle - \frac{4}{3} \langle E \rangle + \frac{1}{3} \langle E \rangle = 0, \tag{13.26} $$

a result that is compatible with the earlier conclusion that the chemical potential for a
photon gas is zero, since $G = \langle N \rangle \mu$. Photons can be added to the system by changing the
temperature of the heat bath: there is no such thing as a photon bath with a controllable
chemical potential because photons are not conserved particles.

Furthermore, the Helmholtz free energy $F = \langle E \rangle - TS$ is equal to $-\langle E \rangle/3$, and the
Helmholtz free energy density $f$ is given by $-e/3$. Thermodynamic relations such as
(3.35) as well as Maxwell relations such as (3.37) may be verified for a photon gas. The 
thermodynamic properties of photons are illustrated in Figure 13.3. 

We can add to our photon gas interpretation by noting that photons carry an entropy
flux $\phi_S = 4\phi_E/(3T)$ since the entropy density is proportional to the energy density
divided by temperature. When radiation emitted from a high temperature source is 
absorbed by an object and re-emitted at a cooler temperature, the energy flows balance 
in a steady state, but the entropy flows do not because of the disparity in temperatures 
of the radiation. The emitted entropy flux is greater than the absorbed flux. 
Figure 13.3 Dependence of photon gas thermodynamic properties on temperature.

The emitted photons have a lower mean frequency, according to Wien’s law, and 
therefore a lower mean energy than the absorbed photons, such that in order to conserve 
energy there must be more photons emitted than absorbed. This creation of particles, 
with the implied increase in uncertainty of microscopic configuration, ties in with the 
conclusion that entropy is being generated by the absorber/emitter as it interacts with the 
radiation, which is what we would expect since the process is clearly out of equilibrium, 
and therefore should satisfy the second law. 
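The imbalance of entropy flows can be made concrete with the relation $\phi_S = 4\phi_E/(3T)$. The following sketch assumes equal absorbed and emitted energy fluxes and uses hypothetical temperatures and flux chosen purely for illustration; it is a rough estimate in the spirit of the argument above, not a full treatment of radiative transfer.

```python
def entropy_flux(energy_flux: float, temperature: float) -> float:
    """Entropy flux carried by black-body radiation, 4 phi_E / (3 T)."""
    return 4.0 * energy_flux / (3.0 * temperature)


# Hypothetical illustrative values: equal energy flux in and out (steady state).
energy_flux = 100.0  # W m^-2
t_absorbed = 6000.0  # K, temperature characterising the incoming radiation
t_emitted = 300.0    # K, temperature of the re-emitted radiation

entropy_in = entropy_flux(energy_flux, t_absorbed)
entropy_out = entropy_flux(energy_flux, t_emitted)
entropy_generated = entropy_out - entropy_in

print(f"entropy flux in  = {entropy_in:.4f} W m^-2 K^-1")
print(f"entropy flux out = {entropy_out:.4f} W m^-2 K^-1")
print(f"entropy generated = {entropy_generated:.4f} W m^-2 K^-1")
```

The generated entropy is positive whenever the emitted radiation is cooler than the absorbed radiation, as the second law requires.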


13.4 The Global Radiation Budget and Climate Change 

The sun is effectively a cavity that leaks black-body radiation at a temperature of about 
5800 K. The intensity falls in proportion to the inverse square of the distance from the 
sun, but the escaping photons retain a Planck spectrum characterised by this temperature, 
as there are very few objects with which they can interact before they arrive at the 
top of the terrestrial atmosphere. However, the situation is different at ground level, 
owing to the absorption and re-radiation of photons from components of the atmosphere, 
including clouds, dust, aerosols and individual gas phase molecules, as well as reflection, 
particularly from clouds. As we noted in the previous section, these are in general entropy 
generating, nonequilibrium processes. 

Gaps open in the Planck spectrum of the incident radiation caused by particular 
absorption mechanisms, for example in the ultraviolet part of the spectrum that interacts 
with stratospheric ozone. The Earth’s surface reflects some of the remaining radiation, 
depending on the frequency, but absorbs the major part of it and re-radiates at a range 
of temperatures depending on the local climate, latitude as well as other geographical 
and meteorological features. The terrestrial emission is similarly reflected, absorbed and 
re-radiated by the atmosphere as it propagates outwards. The so-called global radiation 
budget, expressing the balance of radiant energy coming in and going out, involves much 
complex science, but it is easy to demonstrate that the transfers give rise to an important 
warming effect that has sustained life on this planet and should be interfered with at 
our peril. We refer here to the well-known mechanism of the greenhouse effect and the 
prospect of anthropogenic climate change. 
A simple model provides a rough estimate of the effect. We start by considering the
balance between incident radiation from the sun and emission from the Earth, under
the assumption that the atmosphere does not participate in the radiative transfer process.
The incident radiative flux, seasonally and globally averaged, is about 342 W m$^{-2}$. Some
of this is reflected from the surface and we estimate this fraction, or planetary albedo, to
be 15%: the true Earth albedo is more than 30%, but this includes reflection from clouds,
and here we are ignoring this contribution. So the temperature $T_s$ of the terrestrial
black-body radiation is given by $\sigma T_s^4 = 342 \times 0.85$ W m$^{-2}$, where $\sigma$ is the Stefan-Boltzmann
constant, and thus $T_s \approx 268$ K. This is some 20 degrees lower than the actual mean
terrestrial surface temperature.

Now consider an atmosphere that interacts with radiation. To a rough approximation,
the incident radiation is affected just by additional reflection, particularly off clouds,
raising the planetary albedo to about 30%. Relatively little absorption and re-radiation
takes place at the high frequencies of the incident Planck spectrum. However, matters
are very different at lower frequencies characteristic of terrestrial emission. To first
approximation we shall assume that all of it is intercepted and re-emitted as black-body
radiation at an atmospheric temperature $T_a$. Half of this emission is directed back towards
the Earth, and half out into space, as illustrated in Figure 13.4. It is the latter part that
balances the incoming solar radiation, such that $\sigma T_a^4 = 342 \times 0.7$ W m$^{-2}$, in which case
$T_a \approx 255$ K. The surface temperature is then determined by a balance between the gain
and loss of energy from the atmosphere:

$$ \sigma T_s^4 = 2\sigma T_a^4, \tag{13.27} $$

such that $T_s = 2^{1/4} T_a \approx 303$ K.
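The two radiative balances above amount to a few lines of arithmetic, sketched here using the flux and albedo values quoted in the text. The function and variable names are illustrative only.

```python
SIGMA = 5.67e-8     # Stefan-Boltzmann constant, W m^-2 K^-4
SOLAR_FLUX = 342.0  # averaged incident flux quoted in the text, W m^-2


def black_body_temperature(flux: float) -> float:
    """Temperature at which a black body emits the given energy flux."""
    return (flux / SIGMA) ** 0.25


# No atmospheric participation: the surface balances the absorbed flux (albedo 15%).
t_surface_bare = black_body_temperature(SOLAR_FLUX * (1.0 - 0.15))

# Participating atmosphere: its outward emission balances the absorbed flux (albedo 30%).
t_atmosphere = black_body_temperature(SOLAR_FLUX * (1.0 - 0.30))

# Surface emission equals the total (upward plus downward) atmospheric emission, (13.27).
t_surface = 2.0 ** 0.25 * t_atmosphere

print(f"T_s without atmosphere: {t_surface_bare:.1f} K")
print(f"T_a: {t_atmosphere:.1f} K, T_s with atmosphere: {t_surface:.1f} K")
```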

In this simple case a warming of over 30 K has been provided by atmospheric
participation in the radiative balance. Such a model cannot be expected to be fully accurate,
of course, as it has neglected many important details. A significant flux that is absent
in the analysis is the roughly 10% of terrestrial radiation that is not absorbed by the
atmosphere and is emitted directly to space, according to present estimates. This is the
component of radiative transfer that can be intercepted by anthropogenic radiatively active
atmospheric components: additional greenhouse gases including carbon dioxide and
methane, as well as extra aerosol or haze particles, or indeed additional cloud. The
assessment of the overall effect on the global radiation budget is one of the most complex tasks
currently being undertaken in science, but various studies have left little doubt that there
is significant potential for disturbance of the terrestrial climate due to human activities.

Figure 13.4 A simple model of the greenhouse effect. Radiative transfer for the Earth system
with no participation by the atmosphere, on the left, is contrasted with the more realistic but still
simplified situation on the right, where greenhouse gases, aerosols and clouds interact with the
terrestrial radiation.


13.5 Cosmic Background Radiation 

Solar radiation is just an approximate version of the Planck spectrum, but the Earth is 
also illuminated by radiation that is extraordinarily close to the Planck shape, with a peak 
frequency in the microwave region. It is very nearly isotropic, meaning that it arrives 
from all directions in space at almost equal intensity, when certain biases such as the
motion of the Earth are taken into account. This so-called cosmic background radiation
is believed to consist of photons emitted in the early universe. It is a faint glow from 
the Big Bang, and one of the principal reasons why we are confident that such an event 
took place. 

When the universe was extremely hot and dense, according to current cosmological 
models, photons coupled strongly to a plasma of charged particles and assumed a Planck 
spectrum at the evolving temperature. The ongoing expansion cooled the universe, in 
the normal way of an adiabatic, though possibly nonquasistatic, expansion of a system, 
and the density of matter fell until electrons found it thermodynamically favourable to 
combine with nuclei to form atoms, a process known as recombination. The reduction in 
charge density caused the photons to decouple from matter. The confining cavity walls 
melted away, as it were, leaving behind photons characterised by the temperature of the
plasma at recombination, estimated to be about 3000 K. 

These photons have since then propagated through the thinning vastness of space with 
very little change in the Planck spectrum, and these are what we detect today, except for 
one feature: the temperature of the radiation has cooled to about 2.725 K. The uniformity 
of the temperature of the radiation is very striking, with variations of just $10^{-4}$ K. The
greyscale view of the entire sky shown in Figure 13.5 is a now-famous portrayal of the 
tiny fluctuations as speckles visible against a uniform temperature background, obtained 
by the Wilkinson Microwave Anisotropy Probe (WMAP). It is essentially an image of 
the largest and most featureless physical object we have ever seen. 

The usual interpretation of the cooling is that it is space itself that is expanding. A 
picture where matter and photons are expanding into a fixed spatial arena cannot apply 
because, as with the escape of radiation from the sun, such a process would reduce 
the intensity but not change the temperature of the spectrum. We must imagine that the 
photons, classically regarded as spatially extended disturbances of the electromagnetic 
field, have distorted with the underlying space on which they propagate. Their wavelengths
increase with time, but since $\omega = ck = 2\pi c/\lambda$, this means that their frequencies
decrease. The peak in the Planck spectrum has moved towards lower frequencies, and
according to Wien’s law, this corresponds to a reduction in the characteristic temperature
of the radiation.

Figure 13.5 Representation of spatial fluctuations in the temperature of cosmic background
radiation. The lightest and darkest patches are hotter or cooler by just a few parts in $10^4$. Source:
Reproduced from WMAP, http://map.gsfc.nasa.gov/.

Another view to take is more akin to traditional thermodynamics. We consider the 
expansion of patches of space against a pressure exerted by other patches, with the pressure
deriving from the electromagnetic radiation carried within each. Since the properties
of each patch are identical, there is no heat transfer between them. Assuming that the
expansion is quasistatic, the first law therefore takes the form $d\langle E \rangle = -p\,dV$; inserting
the relation $p = \langle E \rangle/(3V)$ for radiation, this becomes $d\langle E \rangle/\langle E \rangle = -(1/3) dV/V$ or
$\langle E \rangle \propto V^{-1/3}$, and as $\langle E \rangle \propto V T^4$, this means that $T \propto V^{-1/3}$.

The latter relationship may also be obtained through (13.25) by requiring that the photon
entropy of the universe is constant as it undergoes a quasistatic adiabatic expansion.
An increase in the linear dimension of the universe by three or four orders of magnitude 
over the period since recombination would account for the cooling in radiation from 
3000 K to the present day value. This interpretation is slightly hard to sustain, though, 
since the radiation is supposed not to interact with matter during the expansion, and so 
it is difficult to argue that thermal equilibrium is continually re-established. 
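The claim about the expansion factor follows from $T \propto V^{-1/3}$, since the volume scales as the cube of the linear dimension and so $T$ is inversely proportional to it. A minimal sketch of the arithmetic, using the two temperatures quoted in the text:

```python
import math

T_RECOMBINATION = 3000.0  # K, temperature at recombination quoted in the text
T_TODAY = 2.725           # K, present-day temperature of the radiation

# T is proportional to V^(-1/3), and V to the cube of the linear dimension,
# so the linear expansion factor is simply the ratio of the temperatures.
linear_factor = T_RECOMBINATION / T_TODAY
volume_factor = linear_factor**3
orders_of_magnitude = math.log10(linear_factor)

print(f"linear expansion factor: {linear_factor:.0f}")
print(f"volume expansion factor: {volume_factor:.2e}")
print(f"orders of magnitude in linear dimension: {orders_of_magnitude:.2f}")
```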


Exercises 

13.1 Black-body radiation is contained within a box of volume $V$ and temperature $T$.
Calculate the energy density of the radiation at 400 K. Calculate the photon density
and hence the mean energy per photon at this temperature. You may assume that
$\int_0^\infty x^3/(\exp(x) - 1) dx = \pi^4/15$ and $\int_0^\infty x^2/(\exp(x) - 1) dx \approx 2.404$.

13.2 Above what temperature, approximately, would the photon density exceed the typical
molecular density of air in a sealed room?

13.3 Estimate the entropy production associated with the absorption and re-emission of 
radiation by the Earth. 
13.4 An evacuated and rigid container with volume $V$ at a temperature $T$ contains
black-body radiation. The container is placed in thermal contact with a heat bath at
temperature $T_r$. If the heat capacity of the cavity material itself is negligible, show
that the overall change in entropy of the universe after the system and heat bath
have reached thermal equilibrium is $\Delta S_{tot} = 4\sigma V T_r^3 [1 - \tau^3 (4 - 3\tau)]/(3c)$ where
$\tau = T/T_r$. Comment on the sign of $\Delta S_{tot}$ as a function of $\tau$.


14 

Statistical Thermodynamics 
of Interacting Particles 


Having developed the statistical thermodynamics of ideal gases in both classical and 
quantum regimes of behaviour, we now discuss a broader perspective that includes 
systems of interacting particles. The areas of application are vast, but we must be brief. 


14.1 Classical Phase Space 

Modern presentations of statistical thermodynamics focus on the quantum point of view
because it has enormous advantages over a classical treatment with regard to the
enumeration of the multiplicities of microstates and the summation of partition functions,
since the states are discrete. But this was not how statistical thermodynamics was initially 
developed by Boltzmann and Gibbs. In classical mechanics, microstates of a system form 
a continuum specified by the classical dynamical variables of the constituent particles. 
In what sense can we count them? 

The classical procedure is to regard a summation over all microstates as an integral 
over the classical phase space of the system. This is a multidimensional space for which 
the positions and momenta of the constituent particles provide the coordinates. The 
instantaneous state of the system corresponds to a single point in this space that follows 
a continuous trajectory as time progresses, as determined by the equations of motion. 
We used such an integral in Section 6.1.2 when we considered the statistical properties 
of a classical oscillator. 

For a system consisting of a particle described by position $x$ and momentum $p$ and
with total energy $E$ in the range $E \to E + dE$, the multiplicity of microstates may be
written as $d\Omega(E) = g(E) dE$ with

$$ g(E) = \int dp\,dx\,\rho(p,x)\,\delta(H(p,x) - E), \tag{14.1} $$
where we have included a density of microstates $\rho(p,x)$ over the phase space, such
that $\rho(p,x) dp\,dx$ is the number of microstates in the range $x \to x + dx$ and $p \to p + dp$.
The isolation of the system is represented by the delta function that constrains
the Hamiltonian function of position and momentum to be equal to the energy. For
example, the Hamiltonian of the 1-d harmonic oscillator is $H(p,x) = p^2/(2m) + \kappa x^2/2$,
where $\kappa$ is the spring constant.
The canonical partition function of such a system in contact with a heat bath is written as

$$ Z = \int_{-\infty}^{\infty} dp\,dx\,\rho(p,x) \exp\left( -\frac{H(p,x)}{kT} \right). \tag{14.2} $$

In earlier examples, we assumed $\rho$ to be a constant, on the grounds that quantum mechanics
showed that this would be appropriate, but there is also a classical argument in support
of this assumption, based on the idea that the density of states in phase space should not
change if we decide to shift the origin of coordinates such that $x \to x + x_0$ and $p \to p + p_0$.
Such an invariance rules out any variation in $\rho$, and we set it equal to a constant $h_0^{-1}$.
As an example, let us evaluate $Z$ for the 1-d oscillator Hamiltonian:


Z = 



exp 



ck exp 



1 i 

= — (2nmkT)2 
hn 


m 



(14.3) 


where o> is the angular frequency of the oscillator, given by (k/ in ) l/2 , and where we 
have used f_ dr exp(— ax 2 ) = (it/a) 1 / 2 . 
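The result (14.3) can be checked numerically. The following is only a minimal sketch, assuming arbitrary illustrative units and a simple trapezoid rule over a truncated range of eight thermal widths; the function names `classical_oscillator_Z` and `analytic_Z` are hypothetical and do not come from the text.

```python
import math


def classical_oscillator_Z(m, kappa, kT, h0, n=2001, width=8.0):
    """Numerically integrate exp(-H/kT)/h0 over phase space for H = p^2/2m + kappa x^2/2."""

    def gauss_integral(scale, coeff):
        # Trapezoid rule for the integral of exp(-coeff*u^2) over [-width*scale, width*scale].
        du = 2.0 * width * scale / (n - 1)
        total = 0.0
        for i in range(n):
            u = -width * scale + i * du
            weight = 0.5 if i in (0, n - 1) else 1.0
            total += weight * math.exp(-coeff * u * u)
        return total * du

    # The Boltzmann factor factorises, so the p and x integrals are done separately.
    momentum_part = gauss_integral(math.sqrt(m * kT), 1.0 / (2.0 * m * kT))
    position_part = gauss_integral(math.sqrt(kT / kappa), kappa / (2.0 * kT))
    return momentum_part * position_part / h0


def analytic_Z(m, kappa, kT, h0):
    """Closed form 2*pi*kT/(h0*omega) with omega = sqrt(kappa/m)."""
    omega = math.sqrt(kappa / m)
    return 2.0 * math.pi * kT / (h0 * omega)
```

Calling both functions with the same arguments should give values that agree closely, whatever positive value is chosen for the assumed constant `h0`.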

The classical (high temperature) limit of the canonical partition function of the
quantum oscillator in (6.24) is

$$ Z = \frac{\exp\left(-\hbar\omega/(2kT)\right)}{1 - \exp\left(-\hbar\omega/(kT)\right)} \approx \frac{kT}{\hbar\omega}, \tag{14.4} $$

for $kT \gg \hbar\omega$, and we see that not only does the assumed constancy of $\rho$ yield the
correct functional form, but we also learn that the appropriate factor to use in the classical
partition function is $h_0 = h$: Planck's constant. In classical calculations it does not matter
what value for $h_0$ is taken, as most of the thermal properties are obtained by taking
derivatives of the logarithm of $Z$, and $h_0$ therefore disappears. Only in the absolute
values of the entropy and chemical potential does it remain, but even here it cancels out
when taking differences in these quantities.
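The approach to the classical limit in (14.4) can be illustrated with a short sketch. It works with the dimensionless ratio $\hbar\omega/kT$ only, and the function names are hypothetical.

```python
import math


def quantum_Z(hbar_omega, kT):
    """Quantum oscillator partition function exp(-x/2)/(1 - exp(-x)), x = hbar*omega/kT."""
    x = hbar_omega / kT
    return math.exp(-x / 2.0) / (1.0 - math.exp(-x))


def classical_Z(hbar_omega, kT):
    """Classical phase-space result kT/(hbar*omega), i.e. (14.3) with h0 = h."""
    return kT / hbar_omega
```

For `hbar_omega` much smaller than `kT` the two functions agree closely, while for `hbar_omega` several times `kT` the quantum value falls well below the classical one, reflecting the discreteness of the energy levels.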

In general, the classical canonical partition function of a system of $N$ particles in a
3-d space may be written as

$$ Z = \frac{1}{h^{3N} N!} \int \prod_{i=1}^{N} d^3p_i\, d^3x_i\, \exp\left(-\frac{H(\{\mathbf{p}_i, \mathbf{x}_i\})}{kT}\right), \tag{14.5} $$

where the Hamiltonian $H$ is a function that corresponds to the energy, dependent on the
positions and momenta of the particles, and where a factor of $1/N!$ has been inserted to
account for particle indistinguishability, this being the correct factor for a system with
a particle density well below the quantum concentration. We would not impose such a
correction if the particles were distinguishable.

Equation (14.5) goes beyond the treatment of gases or independently tethered 
oscillators, because the Hamiltonian can include particle-particle interaction terms. It 
is a starting point for calculating the statistical properties of general classical systems, 
and hence a powerful tool in the broad field of theoretical condensed matter physics. 


14.2 Virial Expansion 


Let us now study a system of two particles of equal mass that interact through a
potential $\phi(r_{12})$, where $r_{12}$ is their separation. The Hamiltonian is
$H(\mathbf{p}_1, \mathbf{p}_2, \mathbf{x}_1, \mathbf{x}_2) = |\mathbf{p}_1|^2/(2m) + |\mathbf{p}_2|^2/(2m) + \phi(|\mathbf{x}_1 - \mathbf{x}_2|)$
and the canonical partition function is

$$ Z_2 = \frac{1}{2!}\,\frac{1}{h^6} \int \prod_{i=1}^{2} d^3p_i\, d^3x_i\, \exp\left(-\frac{H}{kT}\right). \tag{14.6} $$

The integration is performed over all values of momenta, and all particle positions within
a box of volume $V$.

The momentum integrals are of the form $\int_{-\infty}^{\infty} dp\,\exp(-p^2/(2mkT)) = (2\pi mkT)^{1/2}$
and we can transform to spatial coordinates $\mathbf{R} = (\mathbf{x}_1 + \mathbf{x}_2)/2$ and
$\mathbf{r}_{12} = \mathbf{x}_1 - \mathbf{x}_2$, with a Jacobian of unity, such that

$$ Z_2 = \frac{1}{2!}\left(\frac{2\pi mkT}{h^2}\right)^{3} \int d^3R\, d^3r_{12}\, \exp\left(-\frac{\phi(r_{12})}{kT}\right) = \frac{1}{2} V^2 n_q^2\, a, \tag{14.7} $$

where $n_q = (2\pi mkT/h^2)^{3/2}$ is the quantum concentration. Assuming the integral converges,
we have defined

$$ a = \frac{1}{V} \int_0^{\infty} 4\pi r_{12}^2 \exp\left(-\frac{\phi(r_{12})}{kT}\right) dr_{12}, \tag{14.8} $$

which would be unity if the interaction potential were zero, and otherwise expresses
the deviation from the behaviour of two noninteracting classical particles, which would
be represented by a partition function $Z_2^{\rm ig} = (1/2)(n_q V)^2 = (1/2)(Z_1^{\rm ig})^2$ following the
pattern set by (9.19) and (9.13) with spin $s = 0$.

It is convenient to write

$$ a = 1 + \frac{1}{V} \int_0^{\infty} 4\pi r^2 f(r,T)\, dr = 1 + \frac{\mathcal{B}(T)}{V} \tag{14.9} $$

in terms of a temperature-dependent quantity $\mathcal{B}(T)$, where

$$ f(r,T) = \exp\left(-\frac{\phi(r)}{kT}\right) - 1 \tag{14.10} $$

is called the Mayer function, such that

$$ Z_2 = \frac{1}{2} V^2 n_q^2(T) \left[1 + \frac{\mathcal{B}(T)}{V}\right], \tag{14.11} $$

from which thermodynamic properties follow. For example, the pressure is

$$ p = kT \left(\frac{\partial \ln Z_2}{\partial V}\right)_T \approx \frac{2kT}{V} - kT\,\frac{\mathcal{B}}{V^2}, \tag{14.12} $$

and we can regard the second term as a nonideal correction to the ideal gas pressure
represented by the first term.

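The procedure can be extended to a gas of $N$ particles in a 3-d box. The partition
function is now

$$
\begin{aligned}
Z_N &= \frac{1}{h^{3N} N!} \int \prod_{i=1}^{N} d^3p_i\, d^3x_i\, \exp\left(-\sum_{i=1}^{N} \frac{|\mathbf{p}_i|^2}{2mkT}\right) \exp\left(-\sum_{i=1}^{N} \sum_{j>i} \frac{\phi(r_{ij})}{kT}\right) \\
&= \frac{1}{N!}\, n_q^N \int \prod_{i=1}^{N} d^3x_i \prod_{i=1}^{N} \prod_{j>i} \exp\left(-\frac{\phi(r_{ij})}{kT}\right) \\
&= \frac{1}{N!}\, n_q^N \int \prod_{i=1}^{N} d^3x_i\, \big(1 + f(r_{12},T)\big)\big(1 + f(r_{13},T)\big) \cdots \\
&\approx \frac{1}{N!}\, n_q^N \left[ V^N + \frac{N(N-1)}{2} \int \prod_{i=3}^{N} d^3x_i \int d^3x_1\, d^3x_2\, f(r_{12},T) \right] \\
&= \frac{1}{N!}\, (V n_q)^N \left[ 1 + \frac{N(N-1)}{2V^2} \int d^3x_1\, d^3x_2 \left( \exp\left(-\frac{\phi(r_{12})}{kT}\right) - 1 \right) \right],
\end{aligned}
\tag{14.13}
$$

where $r_{ij}$ is the distance between particles $i$ and $j$. On the second to last line, only
contributions to the integrand involving a single $f$ function are retained, and it is
recognised that there are $N(N-1)/2$ of them corresponding to the number of particle pairs.
We therefore write

$$ Z_N \approx Z_N^{\rm ig} \left[ 1 + \frac{N(N-1)}{2V}\,\mathcal{B} \right], \tag{14.14} $$

where $Z_N^{\rm ig} = (Z_1^{\rm ig})^N/N!$ and we obtain the pressure

$$ p = kT \left(\frac{\partial \ln Z_N}{\partial V}\right)_{T,N} = \frac{NkT}{V} - kT \left[1 + \frac{N(N-1)}{2V}\,\mathcal{B}\right]^{-1} \frac{N(N-1)\,\mathcal{B}}{2V^2} \approx nkT - \frac{1}{2}\, kT\, \mathcal{B}\, n^2. \tag{14.15} $$

We have inserted $n = N/V$ and assumed $N \gg 1$. We take $N^2\mathcal{B}/V$ to be small and have
neglected terms proportional to $n^3$ and beyond.

If we compare (14.15) with the virial expansion of the equation of state of a nonideal
gas in (3.42), it is apparent that we have derived a microscopic form for the second
virial coefficient: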


$$ B_2(T) = -\frac{1}{2}\,\mathcal{B} = \int_0^{\infty} 2\pi r^2 \left[ 1 - \exp\left(-\frac{\phi(r)}{kT}\right) \right] dr, \tag{14.16} $$

that appears in the equation of state

$$ \frac{p}{kT} = \sum_{j=1}^{\infty} B_j(T)\, n^j. \tag{14.17} $$

Our interest only in the quadratic correction to the ideal gas law in (14.15) is the reason
why we neglected terms involving $f(r_{12}) f(r_{13})$ in the integrand of the partition function
(14.13).

Figure 14.1 Two commonly chosen interaction pair potentials: in red the so-called
Lennard-Jones 6-12 potential with range parameter $r_0$, and in blue an exponentially decreasing
attractive part, specified by parameter $\lambda$, with a hard repulsion.

For an attractive hard sphere interaction potential where $\phi(r)$ is equal to infinity
for $r < r_p$, corresponding to a strong repulsion at that particle separation, and small in
magnitude in comparison with $kT$ for $r > r_p$, as illustrated in Figure 14.1, we can write

$$ B_2(T) = \frac{2\pi}{3}\, r_p^3 + \int_{r_p}^{\infty} 2\pi r^2 \left[ 1 - \exp\left(-\frac{\phi(r)}{kT}\right) \right] dr \approx \frac{2\pi}{3}\, r_p^3 + \int_{r_p}^{\infty} 2\pi r^2\, \frac{\phi(r)}{kT}\, dr = b - \frac{a}{kT}, \tag{14.18} $$

which then takes the form argued in (3.61) on the basis of classical thermodynamics.
Thus the $b$ parameter in the van der Waals equation of state is of the order of
the particle volume and $a$ is a measure of the mutual interaction energy of a particle
pair. If $\phi$ took a different form, such as the Lennard-Jones expression also shown in
Figure 14.1, or if $\phi(r) \sim kT$, the second virial coefficient would have a more elaborate
temperature dependence and bear a more complicated relationship to the strength and
range of the interaction.
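As an illustration of (14.16) and (14.18), the sketch below evaluates $B_2$ numerically for a hard core of radius $r_p$ with a weak attractive tail. The tail $\phi(r) = -\epsilon (r_p/r)^6$ is an assumption made purely for this example (it is not one of the potentials in Figure 14.1), and the function names are hypothetical. For this tail, $b = 2\pi r_p^3/3$ and $a = -\int_{r_p}^{\infty} 2\pi r^2 \phi(r)\,dr = b\,\epsilon$.

```python
import math


def b2_hard_core_with_tail(r_p, eps_over_kT, n=20000):
    """B_2 from (14.16) for a hard core plus the assumed tail phi = -eps*(r_p/r)**6."""
    b = 2.0 * math.pi * r_p**3 / 3.0  # hard-core contribution, where exp(-phi/kT) = 0
    # Substitute u = r_p/r so that the tail integral runs over 0 < u < 1:
    # int_{r_p}^inf 2 pi r^2 [1 - exp(-phi/kT)] dr = 2 pi r_p^3 int_0^1 [1 - exp(x u^6)] / u^4 du
    du = 1.0 / n
    tail = 0.0
    for i in range(n):
        u = (i + 0.5) * du  # midpoint rule avoids the endpoint u = 0
        tail += -math.expm1(eps_over_kT * u**6) / u**4
    return b + 2.0 * math.pi * r_p**3 * tail * du


def b2_van_der_waals(r_p, eps_over_kT):
    """Weak-attraction form b - a/kT of (14.18) for the same assumed tail."""
    b = 2.0 * math.pi * r_p**3 / 3.0
    a_over_kT = b * eps_over_kT
    return b - a_over_kT
```

When `eps_over_kT` is small the two functions should nearly coincide; the difference grows as the attraction becomes comparable with $kT$, which is the regime where the simple $b - a/kT$ form no longer applies.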

But for the case of weakly attractive hard spheres, we have

$$ Z_N \approx Z_N^{\rm ig} (1 - nNB_2) = Z_N^{\rm ig} \left[ 1 - nN \left( b - \frac{a}{kT} \right) \right]. \tag{14.19} $$

The mean energy of the gas per unit volume $e = (kT^2/V)\,\partial \ln Z_N/\partial T$ follows the same form as
(3.60):

$$ e(n,T) \approx \frac{3}{2}\, nkT - n^2 kT^2\, \frac{dB_2(T)}{dT} \approx \frac{3}{2}\, nkT + n^2 \int_{r_p}^{\infty} 2\pi r^2 \phi(r)\, dr, \tag{14.20} $$

where the second, nonideal term is clearly a potential energy to supplement the kinetic
energy of the first term. The dependence on $n^2$ arises because we are considering a
pairwise interaction and therefore the energy should be proportional to the number of
particle pairs in the system.

The entropy of the gas follows (3.57), namely

$$ S_G = \frac{\langle E \rangle}{T} + k \ln Z_N = S_G^{\rm ig} - \frac{N^2 k}{V}\, \frac{d\big(T B_2(T)\big)}{dT} = S_G^{\rm ig} - \frac{N^2 k b}{V}, \tag{14.21} $$

suggesting that the interactions reduce the entropy of the gas, per particle, by an amount
proportional to the volume $Nb$ that each particle is unable to explore owing to their
mutual repulsion.

By including more terms in the partition function, further contributions to the pressure, 
energy and entropy of the gas can be obtained, corresponding to higher virial coefficients, 
but they rapidly become complicated to compute. Nevertheless, they are in principle 
calculable from information about microscopic interactions. 


14.3 Harmonic Structures 

The partition function of a system of interacting particles (14.5) can, in principle, provide 
us with all the statistical thermodynamic properties we need, but there are only a few 
cases where exact results can be obtained. Typically, approximations have to be made, 
such as making a virial expansion in density. However, harmonic interparticle interactions 
are an exception, and we shall explore how the thermal properties of a solid can be 
modelled using such an approach. 

By harmonic interactions we mean pairwise quadratic potentials, usually involving
near neighbours in a structure. Harmonic restoring forces become arbitrarily strong as
the participants move further apart, such that each interaction can be regarded as an
unbreakable bond. A snapshot of the system might be sketched as a network of
connections as shown in Figure 14.2, and this has an important effect on the way we construct
the partition function. Each particle is labelled by its position within the network. The
dynamics are incapable of swapping particle positions in a manner that would create a
configuration that is indistinguishable from another. The factor of $1/N!$ therefore need
not appear in the partition function.

Figure 14.2 A network of unbreakable harmonic bonds linking atoms in a general structure, and
in a triatomic linear molecule.


14.3.1 Triatomic Molecule 

As an example, let us consider a molecule consisting of three atoms of equal mass,
as illustrated on the right in Figure 14.2. Harmonic interactions with spring constant $\kappa$
exist between nearest neighbours and the Hamiltonian is
$H_3 = (p_1^2 + p_2^2 + p_3^2)/(2m) + (1/2)\kappa(x_1 - x_2 - r_0)^2 + (1/2)\kappa(x_2 - x_3 - r_0)^2$,
where $r_0$ is the particle separation that minimises each bond energy. The atoms are
constrained to lie on a line of length $L \gg r_0$. The partition function is

$$ Z_3 = \frac{1}{h^3} \int \prod_{i=1}^{3} dp_i\, dx_i\, \exp\left(-\frac{H_3}{kT}\right). \tag{14.22} $$


Such an integral can be evaluated by a suitable transformation of the $x_i$ to so-called
normal coordinates in terms of which the Hamiltonian may be written as separate
quadratic terms. By inspection, these are

$$ x_{12} = x_1 - x_2, \qquad x_{23} = x_2 - x_3, \qquad X = \tfrac{1}{3}(x_1 + x_2 + x_3), $$

for which the Jacobian is unity. The partition function becomes

$$ Z_3 = \frac{(2\pi mkT)^{3/2}}{h^3} \int dx_{12}\, dx_{23}\, dX\, \exp\left(-\frac{\kappa(x_{12} - r_0)^2 + \kappa(x_{23} - r_0)^2}{2kT}\right) \tag{14.23} $$

$$ \phantom{Z_3} = \frac{[2\pi(3m)kT]^{1/2} L}{h}\, \frac{1}{3^{1/2}}\, \frac{2\pi mkT}{h^2}\, \frac{2\pi kT}{\kappa} = \frac{L}{\lambda_{\rm th}(3m)}\, \frac{kT}{\hbar\omega}\, \frac{kT}{\sqrt{3}\,\hbar\omega}, \tag{14.24} $$

where $\omega = (\kappa/m)^{1/2}$. The separability of the Hamiltonian gives rise to factors in the
partition function that represent the 1-d translation of the centre of mass $X$ of the
molecule, of form $[L/\lambda_{\rm th}(3m)]$ analogous to the 3-d version for a particle of mass $m$
in (9.13), with $\lambda_{\rm th}(M) = h/(2\pi MkT)^{1/2}$, and two internal vibrational modes each
represented by a classical partition function of a 1-d harmonic oscillator, as in (14.3), but
with frequencies $\omega$ and $\sqrt{3}\,\omega$. These are called normal modes of oscillation.

Such a partition function gives us the classical mean energy of the system

$$ \langle E \rangle = kT^2 \left(\frac{\partial \ln Z_3}{\partial T}\right) = kT^2 \left(\frac{5}{2T}\right) = \frac{5}{2}\, kT, \tag{14.25} $$

in accordance with the equipartition theorem, since there are five quadratic terms or
degrees of freedom represented in the Hamiltonian $H_3$. The entropy is $S_G = \langle E \rangle/T + k \ln Z_3$,
and since $Z_3 \propto \kappa^{-1}$, this increases if $\kappa$ is reduced, at constant $T$, in line with our
intuitive expectation that a weaker set of bonds allows greater exploration of different
microscopic configurations.
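The two vibrational frequencies $\omega$ and $\sqrt{3}\,\omega$ of the triatomic molecule can be confirmed from its stiffness matrix. The sketch below is illustrative only: it assumes equal masses, takes the stiffness matrix to be the matrix of second derivatives of the potential energy in $H_3$, and checks trial displacement patterns rather than solving the eigenvalue problem in general. The function names are hypothetical.

```python
import math


def stiffness_matrix(kappa):
    """Second derivatives of (kappa/2)(x1-x2-r0)^2 + (kappa/2)(x2-x3-r0)^2."""
    return [[kappa, -kappa, 0.0],
            [-kappa, 2.0 * kappa, -kappa],
            [0.0, -kappa, kappa]]


def mode_frequency(K, v, m):
    """Return the angular frequency if v is an eigenvector of K (equal masses m)."""
    Kv = [sum(K[i][j] * v[j] for j in range(3)) for i in range(3)]
    # Rayleigh quotient gives the candidate eigenvalue m*omega^2
    lam = sum(Kv[i] * v[i] for i in range(3)) / sum(x * x for x in v)
    if any(abs(Kv[i] - lam * v[i]) > 1e-12 * (1.0 + abs(lam)) for i in range(3)):
        raise ValueError("trial displacement is not a normal mode")
    return math.sqrt(max(lam, 0.0) / m)
```

The displacement pattern `[1, 1, 1]` is the rigid translation with zero frequency, `[1, 0, -1]` is the symmetric stretch at $\omega$, and `[1, -2, 1]` is the antisymmetric stretch at $\sqrt{3}\,\omega$.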


14.3.2 Einstein Solid 


It is an intuitive leap to claim now that the partition function of any harmonically bound
set of particles in 3-d space may be written as a product of expressions describing the
translation of, and rigid rotation about, the centre of mass, that we shall immediately
neglect, together with factors corresponding to all the normal modes of oscillation of
the structure. We write $Z_{\rm vib} = \prod_i Z_{\omega_i}$ and recognise that there are $3N - 6$ vibrational
modes. There are six coordinates per particle in three dimensions, and therefore $6N - 6$
degrees of freedom of relevance to the equipartition theorem as there are no quadratic
potential energy terms in the Hamiltonian associated with the three spatial coordinates
of the centre of mass position or the three coordinates that describe the orientation of
the structure with respect to specific axes. Six of the degrees of freedom correspond to
the kinetic energy of bulk translation and rotation, leaving $6N - 12$ for vibration, two
for the potential and kinetic energy of each of the $3N - 6$ modes.

The partition function $Z_\omega$ corresponding to each mode will take the form (6.24)
appropriate to a quantum oscillator. We can then write

$$ \ln Z_{\rm vib} = \sum_i \ln Z_{\omega_i} = \int g(\omega) \ln Z_\omega\, d\omega, \tag{14.26} $$

in terms of a density of states of vibrational modes $g(\omega)$, in a manner familiar now from
our treatments of particles or electromagnetic modes in a box. However, the evaluation
of the spectrum of vibrations, or equivalently the density of states $g(\omega)$, is rather
complicated. Einstein made the very simple approximation that all vibrational modes had the
same frequency $\omega_E$, as explored in Section 6.6. Such a step, together with the use of the
partition function of a quantum oscillator at frequency $\omega_E$, leads to the Einstein model
of the heat capacity of a solid

$$ C_V = \frac{3N\hbar^2\omega_E^2}{4kT^2 \sinh^2\left(\hbar\omega_E/(2kT)\right)}, \tag{14.27} $$

given earlier in (6.29), in which the difference between $3N - 6$ and $3N$ is neglected.
However, such an assumption is too bold and the model does not match experimental
data. The Debye model, that we discuss next, is much more realistic.
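A brief sketch of the Einstein heat capacity (14.27) in dimensionless form may help to show its limiting behaviour. It assumes an Einstein temperature $T_E = \hbar\omega_E/k$, a symbol introduced here only for convenience, and the function name is hypothetical.

```python
import math


def einstein_heat_capacity(T, T_E):
    """Return C_V/(3Nk) from (14.27), with T_E = hbar*omega_E/k (assumed notation)."""
    half_x = T_E / (2.0 * T)
    return (half_x / math.sinh(half_x)) ** 2
```

The function tends to unity for $T \gg T_E$, the classical value $C_V = 3Nk$, and is exponentially small for $T \ll T_E$.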
14.3.3 Debye Solid

Peter Debye (1884-1966) developed a model of the heat capacity of solids that is superior
to Einstein's. He proposed that the spectrum of vibrational modes of a harmonically
bound structure should take the same form as the spectrum of sound vibrations in the
solid, a natural choice and quite accurate, at least for the lower frequencies or larger
wavelengths of the corresponding disturbances.

In a continuum treatment of elastic solids, longitudinal and transverse sound waves
may be transmitted at all frequencies at respective speeds $v_s^{l}$ and $v_s^{t}$, related to the elastic
constants of the material. Debye proposed that the density of states of oscillatory modes
took the same form as the density of modes of electromagnetic waves in a box, but with
the speed of light replaced by an average speed of sound $v_s = (1/3)v_s^{l} + (2/3)v_s^{t}$, where
we recognise that there are two transverse modes and only one longitudinal mode. Thus
from (13.5) we have

$$ g(\omega) = \frac{3V\omega^2}{2 v_s^3 \pi^2} \tag{14.28} $$

for a solid of volume $V$, where the factor of three accounts for the three modes. In order
that the total number of modes should equal $3N - 6 \approx 3N$, this spectrum has an upper
limit at the so-called Debye frequency $\omega_D$ given by

$$ 3N = \int_0^{\omega_D} g(\omega)\, d\omega = \frac{3V}{2 v_s^3 \pi^2} \int_0^{\omega_D} \omega^2\, d\omega = \frac{V\omega_D^3}{2 v_s^3 \pi^2}, \tag{14.29} $$

and so the Debye frequency may be written in terms of particle density $n$ as
$\omega_D = v_s (6\pi^2 n)^{1/3}$.

The partition function is therefore given by

$$ \ln Z_{\rm vib} = \frac{3V}{2 v_s^3 \pi^2} \int_0^{\omega_D} \omega^2 \ln Z_\omega\, d\omega, \tag{14.30} $$

with $Z_\omega = [2\sinh(\hbar\omega\beta/2)]^{-1}$ from (6.24), where $\beta = 1/(kT)$, and thermodynamic
properties can then be determined. From (6.17), the vibrational heat capacity is

$$ C_V = \frac{3V}{2 v_s^3 \pi^2} \int_0^{\omega_D} \omega^2\, \frac{\hbar^2\omega^2}{4kT^2 \sinh^2(\hbar\omega\beta/2)}\, d\omega. \tag{14.31} $$

At high temperature, such that $\hbar\omega_D/kT \ll 1$, the approximation $\sinh(\hbar\omega\beta/2) \approx \hbar\omega\beta/2$
can be used and we obtain

$$ C_V \approx \frac{3Vk}{2 v_s^3 \pi^2} \int_0^{\omega_D} \omega^2\, d\omega = \frac{3Vk}{2 v_s^3 \pi^2}\, \frac{\omega_D^3}{3} = \frac{Vk}{2 v_s^3 \pi^2}\, v_s^3 \left(\frac{6\pi^2 N}{V}\right) = 3Nk, \tag{14.32} $$

as would be expected in the classical limit, while for low temperature we write

$$ C_V = \frac{3V\hbar^2}{8\pi^2 v_s^3 kT^2} \left(\frac{kT}{\hbar}\right)^5 \int_0^{\hbar\omega_D/kT} \frac{x^4}{\sinh^2(x/2)}\, dx \approx \frac{3Vk}{8\pi^2 v_s^3} \left(\frac{kT}{\hbar}\right)^3 \int_0^{\infty} \frac{x^4}{\sinh^2(x/2)}\, dx. \tag{14.33} $$

The integral is equal to $16\pi^4/15$, and so the low temperature heat capacity is

$$ C_V = \frac{2\pi^2 Vk}{5 v_s^3} \left(\frac{kT}{\hbar}\right)^3 = \frac{12\pi^4}{5}\, Nk \left(\frac{T}{T_D}\right)^3, \tag{14.34} $$

where we define the Debye temperature $T_D = \hbar\omega_D/k$. This is in excellent agreement
with the measured temperature dependence of the heat capacity of solids at low temperature.
The Debye heat capacity is sketched in Figure 14.3, showing the approach towards the
classical result $3Nk$ at high temperatures. It might superficially resemble the Einstein heat
capacity shown in Figure 6.5 but the dependence $C_V \propto T^3$ at low temperature shown in the
inset is distinctive and matches the data much better.
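Both limits of the Debye heat capacity can be reproduced numerically. The sketch below is a minimal illustration: it uses $\omega_D = v_s(6\pi^2 n)^{1/3}$ to rewrite (14.31) in the dimensionless form $C_V/(3Nk) = \tfrac{3}{4}(T/T_D)^3 \int_0^{T_D/T} x^4/\sinh^2(x/2)\,dx$ and evaluates the integral with a midpoint rule. The function names and the number of integration points are arbitrary choices, and the ratio $T_D/T$ should be kept moderate so that `math.sinh` does not overflow.

```python
import math


def debye_heat_capacity(T, T_D, n=4000):
    """Return C_V/(3Nk) for the Debye model by midpoint-rule integration."""
    x_D = T_D / T
    dx = x_D / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx  # midpoints avoid the 0/0 form at x = 0
        total += x**4 / math.sinh(x / 2.0) ** 2
    return 0.75 * total * dx / x_D**3


def debye_low_T_limit(T, T_D):
    """Low temperature form (14.34) divided by 3Nk: (4 pi^4 / 5)(T/T_D)^3."""
    return (4.0 * math.pi**4 / 5.0) * (T / T_D) ** 3
```

For $T \gg T_D$ the first function approaches unity, and for $T \ll T_D$ it approaches the $T^3$ law returned by the second.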

Finally, we determine the vibrational entropy of the Debye solid using

$$ S(T) = \int_0^T \frac{C_V(T')}{T'}\, dT', \tag{14.35} $$

together with (14.33), and plot it in Figure 14.3. We find that $S \sim \ln T$ at high
temperatures when $C_V \to 3Nk$ while at low temperatures $S = (4/5)\pi^4 Nk (T/T_D)^3$, as indicated
in the inset of Figure 14.3. This resembles the temperature dependence of the entropy of
the photon gas in (13.25), for reasons that will be clear when considering the similarities
in derivation. Quanta of vibrational energy are called phonons, and consequently this
may be regarded as the entropy of a low temperature phonon gas.



Figure 14.3 Temperature dependence of the vibrational heat capacity (blue) and entropy per
particle (green) according to the Debye model. The $T^3$ behaviour at low temperature is demonstrated
in the inset using logarithmic axes.

Exercises

14.1 Evaluate the second virial coefficient that corresponds to an attractive square well
potential between pairs of particles, such that $\phi(r) = -\phi_0$ for $r < r_0$ and $\phi(r) = 0$
for $r > r_0$. Determine the entropy of the gas in the classical regime.

14.2 Evaluate the classical partition function of two particles moving in two dimensions
and interacting through a harmonic potential, and identify the factor of the partition
function that corresponds to rotation about the centre of mass.

14.3 Evaluate the classical partition function $Z_4$ of a linear molecule of four particles
interacting through harmonic nearest neighbour forces and factorise it in a manner
similar to (14.24).

14.4 Evaluate the entropy per particle of the Einstein solid at low temperature.


15 Thermodynamics away from Equilibrium


The next few chapters are devoted to the topic of entropy generation in systems that are 
out of equilibrium. We examine the traditional modelling of this phenomenon, involving 
an extension of classical thermodynamics, and then discuss how the same process might 
be viewed within statistical thermodynamics. 


15.1 Nonequilibrium Classical Thermodynamics 

The rate of internal generation of entropy in a system exposed to a reservoir in classical
thermodynamics was given in (2.67) as

$$ \frac{dS_{\rm tot}}{dt} = \left(\frac{1}{T(t)} - \frac{1}{T_r(t)}\right) \frac{dE}{dt} + \left(\frac{p(t)}{T(t)} - \frac{p_r(t)}{T_r(t)}\right) \frac{dV}{dt} - \left(\frac{\mu(t)}{T(t)} - \frac{\mu_r(t)}{T_r(t)}\right) \frac{dN}{dt}, \tag{15.1} $$

which takes the form of rates of change of extensive system properties $E$, $V$ and $N$
multiplied by differences in certain intensive properties of the system and reservoir.
This expression is intuitively valuable, but is based on several extensions to the
meaning of thermodynamics. The system is assumed to possess a temperature and chemical
potential even though it is not in equilibrium with the reservoir. The meaning placed
on (15.1) requires us to employ a rather more general thermodynamics, one where the
state variables serve as characteristics of nonequilibrium systems, while retaining their
mathematical relationship to one another, and where equations such as (15.1) relate their
evolution in time.

15.1.1 Energy and Particle Currents and their Conjugate Thermodynamic Driving Forces

Figure 15.1 In nonequilibrium thermodynamics we consider exchanges of energy and particles
between systems that are considered to be large enough to be in a state of local quasi-equilibrium,
but unlike reservoirs, their temperatures and chemical potentials respond to the flows and become
time dependent. Currents flow between the systems, and entropy is generated.

We now develop a general framework for these ideas. Figure 15.1 illustrates a network
of systems between which energy and particles are exchanged. For simplicity we do not
consider volume exchanges. The rates of transfer of energy and particles from system
$\alpha$ to system $\beta$ are to be called currents and are denoted $J_E^{\alpha\beta}$ and $J_N^{\alpha\beta}$, respectively. The
notation implies that $J_E^{\alpha\beta} = -J_E^{\beta\alpha}$. Such transfers between systems allow us to develop
evolution equations for the energy and particle number associated with each system:

$$ \frac{dE_\alpha}{dt} = \sum_\beta J_E^{\beta\alpha}, \qquad \frac{dN_\alpha}{dt} = \sum_\beta J_N^{\beta\alpha}. \tag{15.2} $$

The first of these is equivalent to familiar expressions used to model the equalisation of
temperature between systems. The second relation is analogous to the conservation of
charge in electrical circuits, and bears a resemblance to Kirchhoff's current law.
Following (15.1), the transfers give rise to entropy production, namely

$$ \frac{dS_{\rm tot}^{\alpha\beta}}{dt} = \left(\frac{1}{T_\beta} - \frac{1}{T_\alpha}\right) J_E^{\alpha\beta} - \left(\frac{\mu_\beta}{T_\beta} - \frac{\mu_\alpha}{T_\alpha}\right) J_N^{\alpha\beta}. \tag{15.3} $$

Each system in the network is presumed to be in a state of quasi-equilibrium, and hence
can act as a reservoir for the transfer of energy and particles to a linked neighbour.
The systems are finite in size, and so their temperatures and chemical potentials evolve
with time, but they act like reservoirs in that we consider the entropy production to be
taking place in the links between the systems, and not in the systems themselves. It is
through these links that the currents pass and across which the differences in intensive
thermodynamic variables apply.

Figure 15.2 A set of systems characterised by temperatures $T_\alpha$, $T_\beta$, ... arranged along the x-axis
and exchanging energy according to currents $J_E^{\alpha\beta}$ etc. Entropy is generated in the links between
systems, but we can also define a density of entropy production over the volume elements shown
as dashed boxes, as well as current densities $j_E$ through the interfaces between boxes.

Next we consider a very particular network of systems of equal volume, arranged
in a line along the x-axis and coupled by nearest neighbours as shown in Figure 15.2.
For simplicity, we ignore particle exchange for now and consider only the effect of
energy transfers on the total entropy production of the network, which is

$$ \frac{dS_{\rm tot}}{dt} = \sum_\alpha \sum_{\beta > \alpha} \frac{dS_{\rm tot}^{\alpha\beta}}{dt} = \sum_\alpha \sum_{\beta > \alpha} \left(\frac{1}{T_\beta} - \frac{1}{T_\alpha}\right) J_E^{\alpha\beta}, \tag{15.4} $$


taking the form of a sum over contributions from each link. The notation $\beta > \alpha$ is to
ensure that links are not counted twice. We now convert this sum into an integral by
drawing space-filling cuboidal boxes of volume $\delta V = A\,\delta x$ around each link, where $A$ is
the cross-sectional area and $\delta x$ the thickness. We define a density of entropy production
$ds_{\rm tot}^{\alpha\beta}/dt = d(S_{\rm tot}^{\alpha\beta}/\delta V)/dt$ in the box containing link $\alpha\beta$, and a current per unit area or
current density $j_E^{\alpha\beta} = J_E^{\alpha\beta}/A$, such that

$$ \frac{dS_{\rm tot}}{dt} = \sum_\alpha \sum_{\beta > \alpha} \frac{ds_{\rm tot}^{\alpha\beta}}{dt}\, \delta V = \sum_\alpha \sum_{\beta > \alpha} \delta V\, \frac{1}{\delta x} \left(\frac{1}{T_\beta} - \frac{1}{T_\alpha}\right) j_E^{\alpha\beta}. \tag{15.5} $$

Next we consider small $\delta x$ while insisting that each system is large enough to be
characterised by thermodynamic variables, such that we can write

$$ \frac{dS_{\rm tot}}{dt} = \int dV\, \frac{\partial (T^{-1})}{\partial x}\, j_{Ex}, \tag{15.6} $$

where we have now introduced a spatial gradient of a continuous variable $T^{-1}$, and $j_{Ex}$
is the current density in the x direction.

Extending this argument to three dimensions, and including particle exchange as well,
we can write the total entropy production as the integral

$$ \frac{dS_{\rm tot}}{dt} = \int dV \left[ \nabla\left(\frac{1}{T}\right) \cdot \mathbf{j}_E - \nabla\left(\frac{\mu}{T}\right) \cdot \mathbf{j}_N \right], \tag{15.7} $$

where we have now introduced a particle current density $\mathbf{j}_N$.

This is the central result of classical nonequilibrium thermodynamics. In a system
where there are spatial flows of energy and particles and spatial gradients of temperature
and chemical potential, the density of entropy production is given by

$$ \frac{ds_{\rm tot}}{dt} = \nabla\left(\frac{1}{T}\right) \cdot \mathbf{j}_E - \nabla\left(\frac{\mu}{T}\right) \cdot \mathbf{j}_N, \tag{15.8} $$

which takes the form of current densities multiplied by the so-called conjugate
thermodynamic forces. For example, the conjugate force associated with the flow of energy is
the gradient of the inverse temperature. Entropy production is then distributed across
space, but is most intense where gradients and flows are highest, namely where the
system deviates most from equilibrium.

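Further development follows if we express the gradient of $\mu/T$ in terms of those of
inverse temperature and particle density $n$. We write

$$ d(\mu/T) = \mu\, d(T^{-1}) + T^{-1} d\mu = \mu\, d(T^{-1}) + T^{-1} \left(\frac{\partial\mu}{\partial T}\right)_n dT + T^{-1} \left(\frac{\partial\mu}{\partial n}\right)_T dn, \tag{15.9} $$

and so, since $dT = -T^2\, d(T^{-1})$, the spatial gradients are related by

$$ \nabla\left(\frac{\mu}{T}\right) = \left[ \mu - T \left(\frac{\partial\mu}{\partial T}\right)_n \right] \nabla(T^{-1}) + T^{-1} \left(\frac{\partial\mu}{\partial n}\right)_T \nabla n, \tag{15.10} $$

where

$$ \left(\frac{\partial\mu}{\partial T}\right)_n = \left(\frac{\partial\mu}{\partial T}\right)_{N,V} = -\left(\frac{\partial S}{\partial N}\right)_{T,V}, \tag{15.11} $$

using the Maxwell relation (3.39), and so

$$ \mu - T \left(\frac{\partial\mu}{\partial T}\right)_n = \mu + T \left(\frac{\partial S}{\partial N}\right)_{T,V} = \left(\frac{\partial F}{\partial N}\right)_{V,T} + T \left(\frac{\partial S}{\partial N}\right)_{V,T} = \left(\frac{\partial E}{\partial N}\right)_{V,T} = \epsilon, \tag{15.12} $$

using (3.35) and $F = E - TS$, where $\epsilon$ is the energy change associated with adding one
particle to a system without a change in temperature or volume. If we recognise that the
energy current density is a sum of components corresponding to heat flow and particle
flow, or $\mathbf{j}_E = \mathbf{j}_Q + \epsilon\, \mathbf{j}_N$, then we can write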


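$$ \frac{ds_{\rm tot}}{dt} = \nabla\left(\frac{1}{T}\right) \cdot \left[ \mathbf{j}_E - \epsilon\, \mathbf{j}_N \right] - \frac{1}{T} \left(\frac{\partial\mu}{\partial n}\right)_T \nabla n \cdot \mathbf{j}_N = -\frac{1}{T^2}\, \nabla T \cdot \mathbf{j}_Q - \frac{1}{T} \left(\frac{\partial\mu}{\partial n}\right)_T \nabla n \cdot \mathbf{j}_N, \tag{15.13} $$

where we can regard the conjugate thermodynamic forces associated with the heat and
particle currents to be proportional to the temperature and particle density gradients,
respectively, which is what we might have expected on the grounds of intuition. If we
allowed system volume changes to take place as well, this would lead to an additional
term on the right hand side to match the relevant term in (15.1).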

A system where heat flows as a consequence of a gradient in temperature is shown in
Figure 15.3. Hotter regions are shown in darker colour, and the magnitude and direction
of the heat current densities are indicated with arrows. The density of entropy production
is represented by the speckles and is strongest where the gradients and flows are largest.

Figure 15.3 Contours of temperature and vectors of heat current density, together with the
density of entropy production shown as a speckle pattern.

The heat current density is commonly taken to be proportional to the temperature
gradient:

$$ \mathbf{j}_Q = -\kappa \nabla T, \tag{15.14} $$

where $\kappa$ is the thermal conductivity, and similarly

$$ \mathbf{j}_N = -D \nabla n, \tag{15.15} $$

is often used, where $D$ is the diffusion coefficient. These are empirical expressions known
as Fourier's law and Fick's law, respectively. More complicated transport laws might be
imagined, with a particle current responding to a temperature gradient, for example, but
here we ignore this for simplicity. Thus


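$$ \frac{ds_{\rm tot}}{dt} = \frac{\kappa}{T^2}\, (\nabla T)^2 + \frac{D}{T} \left(\frac{\partial\mu}{\partial n}\right)_T (\nabla n)^2, \tag{15.16} $$

and this expression makes it quite apparent that the local rate of entropy production
in the flow is never negative, in accordance with the second law. This follows since $\kappa$
and $D$ are both positive, and the chemical potential at constant $T$ increases with particle
density: using the Gibbs-Duhem equation (3.66),

$$ \left(\frac{\partial\mu}{\partial n}\right)_T = \left(\frac{\partial\mu}{\partial p}\right)_T \left(\frac{\partial p}{\partial n}\right)_T = v \left(\frac{\partial p}{\partial n}\right)_T > 0, \tag{15.17} $$

due to the fact that the volume per particle $v$ is positive, and empirically the pressure of
a system never decreases with an increase in particle density at constant temperature.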
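The equivalence of (15.13) and (15.16) under Fourier's and Fick's laws, and the non-negativity of the result, can be illustrated in one dimension. The sketch below assumes the classical ideal gas form $(\partial\mu/\partial n)_T = kT/n$ for the particle term, which is an assumption made for illustration and is not required by the text. The function names and the unit choice `k_B=1.0` are likewise hypothetical.

```python
def entropy_production_density(kappa, D, T, grad_T, n, grad_n, k_B=1.0):
    """Quadratic form (15.16) in 1-d, assuming (dmu/dn)_T = k_B*T/n (ideal gas)."""
    heat_term = kappa * grad_T**2 / T**2
    particle_term = D * k_B * grad_n**2 / n
    return heat_term + particle_term


def entropy_production_from_currents(kappa, D, T, grad_T, n, grad_n, k_B=1.0):
    """Same quantity built from currents and conjugate forces as in (15.13)."""
    j_Q = -kappa * grad_T      # Fourier's law (15.14)
    j_N = -D * grad_n          # Fick's law (15.15)
    dmu_dn = k_B * T / n       # assumed ideal gas response
    return -(grad_T / T**2) * j_Q - (dmu_dn / T) * grad_n * j_N
```

Reversing the sign of either gradient reverses the corresponding current as well, so the product of force and current, and hence the entropy production, is unchanged.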

The transport laws (15.14) and (15.15) also allow us to cast the energy and particle
conservation equations (15.2) into differential form. We write $E_\alpha = e_\alpha\, \delta V$ and

$$ \frac{de_\alpha}{dt}\, \delta V = \sum_\beta \delta V\, \frac{1}{\delta x}\, j_E^{\beta\alpha}, \tag{15.18} $$

and for system $\gamma$ in Figure 15.2, with neighbours $\beta$ and $\delta$, this becomes

$$ \frac{de_\gamma}{dt} = \frac{1}{\delta x} \left( j_E^{\beta\gamma} - j_E^{\gamma\delta} \right) \quad\Rightarrow\quad \frac{de}{dt} = -\frac{\delta j_{Ex}}{\delta x}, \tag{15.19} $$

using notation that suggests a spatial derivative of the energy current in the x direction.
The generalisation to 3-d and a continuum of spatial positions is

$$ \frac{\partial e}{\partial t} = -\nabla \cdot \mathbf{j}_E, \tag{15.20} $$

and similarly, the particle number density evolves according to

$$ \frac{\partial n}{\partial t} = -\nabla \cdot \mathbf{j}_N. \tag{15.21} $$

A positive divergence of a current at a particular point corresponds intuitively to an
overall flow away from that point, and hence a fall in the density of the appropriate
conserved quantity, so the sign on the right hand side of these so-called continuity
equations makes intuitive sense. Inserting (15.15), we deduce that

$$ \frac{\partial n}{\partial t} = \nabla \cdot (D \nabla n) = D \nabla^2 n, \tag{15.22} $$

with the last form applying as long as $D$ is the same at all spatial points. This is known
as the diffusion equation. The counterpart for heat flow is obtained by combining (2.56),
(3.44) and (15.2) and writing $de = c_v\, dT + \epsilon\, dn$ where $c_v$ is the heat capacity per unit
volume, as well as $\mathbf{j}_E = \mathbf{j}_Q + \epsilon\, \mathbf{j}_N$ such that (15.20) becomes

$$ c_v\, \frac{\partial T}{\partial t} = -\nabla \cdot \mathbf{j}_Q = \kappa \nabla^2 T, \tag{15.23} $$

assuming that $\epsilon$, $c_v$ and $\kappa$ are also constants. This final form is known as the heat
equation.
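The link between Fick's law, the continuity equation and the diffusion equation can be made concrete with a discretised 1-d example. The following is a sketch under stated assumptions rather than a recommended solver: it uses an explicit forward-Euler time step, cells of equal width `dx`, and zero-current (closed) boundaries, all of which are choices made here for illustration.

```python
def diffusion_step(n, D, dx, dt):
    """Advance the cell densities n by one explicit step of dn/dt = D d2n/dx2."""
    if D * dt / dx**2 > 0.5:
        raise ValueError("explicit scheme is unstable for D*dt/dx**2 > 0.5")
    # Fick's law (15.15) on each interior interface; no current through the ends
    j = [0.0] + [-D * (n[i + 1] - n[i]) / dx for i in range(len(n) - 1)] + [0.0]
    # Continuity equation (15.21): cell i gains what enters on the left, loses what leaves on the right
    return [n[i] - (j[i + 1] - j[i]) * dt / dx for i in range(len(n))]
```

Starting from `[0, 0, 10, 0, 0]` with `D = dx = 1` and `dt = 0.1`, one step gives `[0, 1, 8, 1, 0]`: the peak spreads while the total particle number is conserved, because each interface current is subtracted from one cell and added to its neighbour.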


15.1.2 Entropy Production in Constrained and Evolving Systems 

Entropy is produced when thermodynamic state variables change nonquasistatically as 
a consequence of the flows of energy and particles. It is therefore associated with the 
evolution or relaxation of a system. But entropy is also generated if state variables are 
time independent but there are nonzero currents. This is a constrained system since it 
must be maintained away from equilibrium by external boundary conditions. It is known 
as a nonequilibrium steady state. 

We briefly describe an example of this mode of entropy production using three coupled
systems $\alpha$, $\beta$ and $\gamma$ illustrated in Figure 15.4. We consider energy transfers only, and
employ an equation for the evolution of the energy of system $\beta$ of the form

$$ \frac{dE_\beta}{dt} = J_E^{\alpha\beta} - J_E^{\beta\gamma}, \tag{15.24} $$

with energy currents given by a version of Fourier's law (Newton's law of cooling):
$J_E^{\alpha\beta} = -K(T_\beta - T_\alpha)$ where $K$ is a constant.

Figure 15.4 The decline in the rate of production of entropy as a system relaxes towards
equilibrium. In contrast, if $T_\alpha = T_1 > T_\gamma = T_2$, then a nonequilibrium steady state would be possible
with a time independent rate of entropy production.

In a case of relaxation, systems $\alpha$ and $\gamma$ are held at a temperature $T_0$ and by introducing
a heat capacity $C_\beta = dE_\beta/dT_\beta$ we find that the time dependence of the temperature $T_\beta$ of
system $\beta$ is given by

$$ \frac{dT_\beta}{dt} = \frac{K}{C_\beta} \left[ T_0 - T_\beta - (T_\beta - T_0) \right] = -\frac{2K}{C_\beta}\, (T_\beta - T_0), \tag{15.25} $$

and so $T_\beta - T_0 = \Delta T \exp(-2Kt/C_\beta)$, where $\Delta T$ is the value of $T_\beta - T_0$ at $t = 0$. From
(15.4) the rate of entropy production is

$$ \frac{dS_{\rm tot}}{dt} = 2 \left(\frac{1}{T_\beta} - \frac{1}{T_0}\right) K (T_0 - T_\beta) = \frac{2K (\Delta T)^2 \exp\left(-4Kt/C_\beta\right)}{T_0 \left[ T_0 + \Delta T \exp\left(-2Kt/C_\beta\right) \right]}, \tag{15.26} $$

and this is plotted in Figure 15.4: it steadily declines as equilibrium is approached.
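A consistency check on (15.26) is that its time integral should equal the total entropy change of the system and the two reservoirs. For a constant heat capacity $C_\beta$ this is $C_\beta[\Delta T/T_0 - \ln(1 + \Delta T/T_0)]$, obtained here from elementary thermodynamics rather than quoted from the text. The sketch below assumes constant $K$ and $C_\beta$, uses arbitrary units, and the function names are hypothetical.

```python
import math


def temperature(t, T0, dT, K, C):
    """Solution of (15.25): T_beta(t) = T0 + dT*exp(-2*K*t/C)."""
    return T0 + dT * math.exp(-2.0 * K * t / C)


def entropy_production_rate(t, T0, dT, K, C):
    """Rate (15.26), written as 2K(T0 - T_beta)^2 / (T0*T_beta)."""
    T_beta = temperature(t, T0, dT, K, C)
    return 2.0 * K * (T0 - T_beta) ** 2 / (T0 * T_beta)


def total_entropy_produced(T0, dT, K, C, relaxation_times=40.0, n=20000):
    """Midpoint-rule time integral of the rate over many relaxation times C/(2K)."""
    dt = relaxation_times * C / (2.0 * K) / n
    return sum(entropy_production_rate((i + 0.5) * dt, T0, dT, K, C) for i in range(n)) * dt


def expected_total(T0, dT, C):
    """Entropy gained by the reservoirs plus entropy change of system beta."""
    return C * (dT / T0 - math.log(1.0 + dT / T0))
```

The rate is positive whether system $\beta$ starts hotter or colder than the reservoirs, and its integral depends only on the initial and final states, not on the value of $K$.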

In contrast, consider a case of a nonequilibrium steady state brought about by fixed
boundary conditions $T_\alpha = T_1$ and $T_\gamma = T_2 < T_1$. By inspection, the steady state
temperature of system $\beta$ is given by $(T_1 + T_2)/2$, and the steady state rate of entropy
production is

$$\frac{dS_{\rm tot}}{dt} = \left(\frac{1}{T_\beta} - \frac{1}{T_\alpha}\right)J_E^{\alpha\beta} + \left(\frac{1}{T_\gamma} - \frac{1}{T_\beta}\right)J_E^{\beta\gamma} = \frac{K}{2}(T_1 - T_2)\left[\frac{T_1 - T_2}{T_1(T_1 + T_2)} + \frac{T_1 - T_2}{T_2(T_1 + T_2)}\right] = \frac{K(T_1 - T_2)^2}{2T_1T_2}, \qquad (15.27)$$

which is positive, as expected. In general, a system would also generate entropy as it
evolves towards such a constrained nonequilibrium steady state.
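The two modes of entropy production can be compared numerically. The following is a minimal sketch of (15.26) and (15.27), assuming illustrative values for $K$, $C_\beta$ and the temperatures that do not come from the text; the function names are hypothetical.

```python
import math


def relaxation_rate(t, K, C_beta, T0, dT):
    """Entropy production rate (15.26) while system beta relaxes towards T0."""
    T_beta = T0 + dT * math.exp(-2.0 * K * t / C_beta)
    # Two identical junctions, each carrying a current K*(T0 - T_beta).
    return 2.0 * K * (T_beta - T0) ** 2 / (T0 * T_beta)


def steady_state_rate(K, T1, T2):
    """Entropy production rate (15.27) with boundaries held at T1 > T2."""
    T_beta = 0.5 * (T1 + T2)          # steady state temperature of beta
    J_ab = K * (T1 - T_beta)          # current from alpha into beta
    J_bg = K * (T_beta - T2)          # current from beta into gamma
    return (1.0 / T_beta - 1.0 / T1) * J_ab + (1.0 / T2 - 1.0 / T_beta) * J_bg


# Illustrative (assumed) parameters: K in W/K, C_beta in J/K, temperatures in K.
K, C_beta = 1.0, 10.0
rates = [relaxation_rate(t, K, C_beta, T0=300.0, dT=50.0) for t in (0.0, 5.0, 50.0)]
steady = steady_state_rate(K, T1=350.0, T2=300.0)
```

The list `rates` falls towards zero as the relaxation proceeds, whereas `steady` is a fixed positive number for as long as the boundary temperatures are maintained.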


15.2 Nonequilibrium Statistical Thermodynamics 

It will come as no surprise to learn that microstate probabilities are central to
nonequilibrium statistical thermodynamics. They might be time dependent, to allow a description
of the relaxation of systems towards equilibrium, or they might be time independent but
constrained, such that there is a mean flow of energy or particles through the system.

Microstate probabilities readily enable us to construct the mean energy and particle
number, but just as in classical nonequilibrium thermodynamics, concepts such as
temperature, chemical potential and entropy do not necessarily extend to nonequilibrium
situations. For example, both the Boltzmann and Gibbs entropies are defined for
equilibrium conditions and can be represented in terms of equilibrium probabilities. But
the Shannon entropy $S_I = -k\sum_i P_i(t)\ln P_i(t)$ is constructed to be more general and
might conceivably be employed for situations where the probabilities $P_i$ evolve in time.
This does not guarantee that $S_I$ matches the properties of the nonequilibrium entropy that
appears in classical thermodynamics, of course. First we must consider how probabilities
evolve, and what this might mean.

15.2.1 Probability Flow and the Principle of Equal a Priori Probabilities 

If probabilities are a distillation of our best judgement of the likelihood that a system 
might be found in a given microscopic state, then they might naturally be time dependent 
in certain situations, for example, in the period just after the acquisition of information. 
The time dependence should then reflect, at least in some way, the microscopic dynamics 
of the system, but it is easy to show that this introduces a problem. 

Consider a system of three oscillators with a familiar triangular phase space as shown
in Figure 15.5, evolving according to a known set of microscopic dynamical laws, and
starting from a known initial probability distribution over the phase space. For example,
let us take the system at $t = 0$ to have a probability 1/3 of being in each of the microstates
at the vertices of the triangle, indicated by the columns. The initial Shannon entropy
is $-k\sum_i P_i\ln P_i = (-k(1/3)\ln(1/3)) \times 3 = k\ln 3$. But since we know the dynamical
rules, at a later time we remain just as uncertain as to the identity of the occupied
microstate: it could be the end point of a trajectory starting from any of the three possible
initial states. This situation is shown as the first choice of outcome in Figure 15.5: the
probabilities have been shifted to three new microstates, and the Shannon entropy remains
equal to $k\ln 3$.

Figure 15.5 The transport of probability as time progresses: either rigidly and deterministically
according to Liouville’s theorem if the rules of dynamics are known, or stochastically, with some
spreading out, if they are uncertain.

The point is that such a model cannot provide us with a working representation of 
nonequilibrium thermodynamics: the system is not initially in equilibrium (since the 
initial $P_i$ do not satisfy the principle of equal a priori probabilities) and yet its entropy
remains constant as time progresses; it does not seem to be able to relax such that the 
microstate probabilities equalise. In systems that evolve according to known laws of 
motion, the microstate probabilities are transported along trajectories in phase space, 
a result known as Liouville’s theorem, and the Shannon entropy is conserved. This 
difficulty was known to Gibbs and various interpretations have been sought. 

If we are to employ Shannon entropy in general discussions of entropy change, and
by implication the Gibbs and Boltzmann forms for equilibrium situations, then the
concession we must make is to give up the insistence that the dynamical rules determining
trajectories in the system phase space are known. If we are not sure where a trajectory starting
at a particular microstate will end up after a certain time has elapsed, then Liouville’s
theorem no longer applies and the Shannon entropy can change. The rigid transport of
probability shown in Figure 15.5 would be replaced by the second option: a spreading out of
the probability amongst further microstates.
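The contrast between rigid transport and spreading can be illustrated with a toy calculation. This is a minimal sketch, assuming a hypothetical ring of twelve microstates rather than the triangular phase space of Figure 15.5, a one-to-one shift as the "known" dynamics, and an arbitrary nearest-neighbour spreading rule as the "uncertain" dynamics; entropy is in units of $k$.

```python
import math


def shannon_entropy(probs, k=1.0):
    """Shannon entropy -k sum_i P_i ln P_i, skipping empty microstates."""
    return -k * sum(p * math.log(p) for p in probs if p > 0.0)


def deterministic_step(probs, mapping):
    """Known dynamics: microstate i is carried to mapping[i] (a one-to-one map)."""
    new = [0.0] * len(probs)
    for i, p in enumerate(probs):
        new[mapping[i]] += p
    return new


def stochastic_step(probs, spread=0.5):
    """Uncertain dynamics (assumed rule): a fraction `spread` of each
    probability leaks equally to the two neighbouring microstates on a ring."""
    n = len(probs)
    new = [0.0] * n
    for i, p in enumerate(probs):
        new[i] += (1.0 - spread) * p
        new[(i - 1) % n] += 0.5 * spread * p
        new[(i + 1) % n] += 0.5 * spread * p
    return new


n_states = 12
initial = [0.0] * n_states
for occupied in (0, 4, 8):            # three equally likely starting microstates
    initial[occupied] = 1.0 / 3.0

shift = [(i + 1) % n_states for i in range(n_states)]
s_initial = shannon_entropy(initial)
s_known = shannon_entropy(deterministic_step(initial, shift))
s_uncertain = shannon_entropy(stochastic_step(initial))
```

Here `s_known` equals `s_initial` (both are $\ln 3$), while `s_uncertain` is larger because the probability now occupies nine microstates.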

What can such a situation mean? It could be that we actually do not know the 
dynamical rules exactly, and the residual uncertainty in parameters or even in the form 
of the equations of motion can give rise to uncertainty in destination. But leaving aside 
these problems, it might instead be the case that the microstates illustrated in Figure 15.5 
are not sufficiently fine in scale to allow us to state with precision where the trajectories 
emanating from them might lead. They might actually represent a collection of finer- 
scale microstates, each of which would give rise to a different trajectory, and we cannot 
predict which one is actually followed. This would require the trajectories to diverge 
from one another, which does not always happen, but arguably is a natural state of 
affairs in a complex system: it is an aspect of chaotic dynamics. 

As an extension to this, if a system with fine grained microstates is coupled to a 
reservoir that is by definition coarse grained, then system trajectories are not exactly 
predictable because the initial microstate of the reservoir is not known, and the dynamical 
rules describing the system-reservoir interactions would reflect this. A distinguishing 
feature of models that can lead to the growth of uncertainty is that the dynamical rules 
involve macrostate as well as microstate variables. 

We can therefore make progress by accepting that phase space trajectory dynamics 
might be uncertain rather than deterministic, owing to deficiencies in our microscopic 
perception. We shall develop models of uncertain, or stochastic dynamics in Chapter 16, 
but for now, we surmise that such a model might give rise to equal a priori probabilities, 
or something similar to them, in an isolated system, as it would have the effect of 
spreading probabilities out across phase space. 

As we saw in Section 8.3, equal microstate probabilities give rise to a global maximum
in Shannon entropy. Shannon entropy is a measure of the uncertainty of a probability
distribution, and an appealing intuitive view is that the dynamics naturally increase
this measure because of inherent dynamical uncertainties, until it can be increased no
longer. The principle of equal a priori probabilities for isolated systems would then be
a consequence of the dynamics. We take this further in the next section.


15.2.2 The Dynamical Basis of the Principle of Entropy Maximisation 

In this section we consider in general terms how the evolution of probabilities and a
natural increase in uncertainty might account for the procedure where we perform a
constrained maximisation of the Shannon entropy in order to identify a state of equilibrium.

We imagine that the dynamics are such that the total Shannon entropy of a world
described in part by macroscopic variables should increase with time. We then divide
the world into a system and its environment and suppose that the total entropy is a sum
of a system entropy that we represent using the Shannon expression in terms of
time-dependent microstate probabilities $P_i(t)$, and a reservoir entropy $S_r$ that also depends on
time as a result of mean energy or particle exchanges with the system.

Incremental changes in the reservoir entropy satisfy the fundamental relation (2.49)
or (8.21), which we write in the form

$$T_r\,dS_r = d\langle E_r\rangle - \mu_r\,d\langle N_r\rangle + \sum_j f_{X_j}\,dX_j, \qquad (15.28)$$

where the brackets represent averages over the state of the reservoir, which is presumed
to remain in quasi-equilibrium throughout. The last contribution is related to changes in
parameters $X_j$ such as volume. For now, we ignore these, and we also ignore contributions
due to particle exchange. Then the rate of change of total nonequilibrium entropy is

$$\frac{dS_{\rm tot}}{dt} = \frac{dS_I}{dt} + \frac{dS_r}{dt} = \frac{dS_I}{dt} + \frac{1}{T_r}\frac{d\langle E_r\rangle}{dt} = \frac{dS_I}{dt} - \frac{1}{T_r}\frac{d\langle E\rangle}{dt} = \frac{d}{dt}\left(S_I - T_r^{-1}\langle E\rangle\right), \qquad (15.29)$$

where $\langle E\rangle = \sum_i E_i P_i$, noting that the total energy $E + E_r$ is fixed, such that the rates of
change of the average system and reservoir energies are equal and opposite.

The evolution of $S_{\rm tot}$ or $S_I - T_r^{-1}\langle E\rangle$ towards a maximum as a result of the dynamics
of the $P_i$ is equivalent to the maximisation of $-k\sum_i P_i\ln P_i - \lambda'\sum_i E_i P_i$ over the $P_i$
subject to the normalisation condition $\sum_i P_i = 1$, where $\lambda' = T_r^{-1} = k\beta$; in other words,
it is identical to the axiomatic procedure for identifying the equilibrium probabilities
for the canonical ensemble discussed in Section 8.3. If we were to retain the term in
(15.28) corresponding to particle exchange, we would recover the Shannon procedure
for deriving equilibrium probabilities for the grand canonical ensemble as well.
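For reference, the constrained maximisation can be written out in a few lines. This is a sketch of the standard Lagrange multiplier calculation; the symbols $\Phi$ and $\nu$ (the multiplier enforcing normalisation) are labels introduced here for illustration and are not taken from the text.

```latex
\begin{align}
\Phi &= -k\sum_i P_i \ln P_i - \lambda' \sum_i E_i P_i - \nu\Big(\sum_i P_i - 1\Big), \\
\frac{\partial \Phi}{\partial P_i} &= -k\ln P_i - k - \lambda' E_i - \nu = 0
\quad\Rightarrow\quad
P_i = \frac{\exp(-\lambda' E_i/k)}{Z}, \qquad Z = \sum_j \exp(-\lambda' E_j/k).
\end{align}
```

With $\lambda' = T_r^{-1} = k\beta$ the stationary probabilities are $P_i = \exp(-\beta E_i)/Z$, the canonical form, with $\nu$ fixed by normalisation.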

The constrained maximisation of Shannon entropy, and the equilibrium microstate
probabilities used in the various ensembles of statistical thermodynamics, would then
simply be a reflection of the increase in total uncertainty embodied in $S_{\rm tot}$, which would
be a consequence of the dynamics of probabilities, a topic that we address next.

Exercises

15.1 Calculate the total entropy produced in the relaxation discussed in Section 15.1.2. 

15.2 Solve the heat equation to obtain the steady state temperature profile for a copper
rod of length 1 m and cross-sectional area 1 cm$^2$ with its ends maintained at
temperatures of 0 °C and 40 °C, and determine the rate of production of entropy if the
thermal conductivity of copper is 385 W m$^{-1}$ K$^{-1}$.

15.3 Consider a classical oscillator with known initial position and velocity. Sketch the
probability distribution over its $(x, v)$ phase space at a later time if left isolated.
Does its entropy increase? 


16 

The Dynamics of Probability 


In this chapter we discuss mathematical models that attempt to
represent the evolution of uncertainty, that is to say the time
dependence of the probabilities of occupation of microstates of a system.

These matters are not often covered in an undergraduate text on
statistical physics, but are central to an understanding of the topic. The
material can seem forbidding, hence a hazard warning sign!

16.1 The Discrete Random Walk 

We start with a simple case: the symmetric discrete random walk in 1-d. A particle is
initially at the origin, and every timestep of length $\tau$ it makes a move through a distance
$a$, but to the left or right with equal probability. We presume that there are underlying
deterministic dynamical rules that produce such steps, but that we lack the details of
their operation, and so instead we employ a stochastic scheme that captures the same
events through the explicit incorporation of randomness. If the probabilities of a step to
left or right were different, the scheme would be called an asymmetric random walk.

Notice the ‘Markov’ property of this rule for updating the configuration: the position
after the $(n + 1)$th timestep is determined only from information about the situation after
the $n$th timestep. We could make the update rule depend on information further back in
time, and such a scheme with an extended memory is called ‘non-Markovian’. However,
we shall limit ourselves to Markovian dynamics.

We can readily generate a set of realisations of a symmetric random walk by the
selection of steps through repeatedly tossing a coin, as illustrated in Figure 16.1. After
$n$ steps (we have chosen $n = 5$), the particle will lie somewhere in the range from $-na$
to $+na$. The probability $P_n(x_m)$ of ending up at position $x_m = ma$ after time $n\tau$ is
the probability of taking $(n + m)/2$ steps to the right and $(n - m)/2$ steps to the left,
summed over all possible sequences. One of several such paths that reach the point
$m = -1$ is illustrated, and an exhaustive search yields nine others.

Statistical Physics: An Entropic Approach, First Edition. Ian Ford. 

©2013 John Wiley & Sons, Ltd. Published 2013 by John Wiley & Sons, Ltd. 







Figure 16.1 Three realisations of the symmetric random walk in 1-d, showing the unique
ways to reach $x = \pm 5a$ after five steps starting from the origin, and one of the ten paths that
end up at $x = -a$. Pascal’s triangle on the right illustrates the computation of the numbers of
such paths.

In fact we can determine the number of paths leading to each destination by
constructing Pascal’s triangle, where numbers in each row are obtained by summing the
two numbers positioned diagonally above, as demonstrated in Figure 16.1. This reflects
the process of generating a path by making $n$ appropriate 50:50 decisions, or
equivalently it follows from the fact that each point can be reached from one of two points at
the preceding timestep. The probability of reaching a destination is the relevant number
in Pascal’s triangle multiplied by the probability of each individual path, which is $(1/2)^n$,
and this determines the shape of the probability distribution over the phase space of
particle positions, and hence the various statistical properties of the walk. In general, the
$l$th moment of the probability distribution after step $n$ is defined as

$$\langle x^l\rangle_n = \sum_{m=-\infty}^{+\infty} x_m^l P_n(x_m). \qquad (16.1)$$

The $l = 1$ moment is the mean displacement while the $l = 2$ moment is the mean square
displacement.

If we cannot list every possible realisation of a random walk, we can instead
numerically generate as many as might be necessary to give us sufficiently accurate frequencies
of various outcomes, and then use these as estimates of the probability distribution 
resulting from the dynamics. Such simulation is called ‘Monte Carlo’: each realisation 
is generated randomly, just as the motion of a ball on the roulette wheel is supposed to 
be controlled by chance. 
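A Monte Carlo estimate of this kind takes only a few lines. The following is a minimal sketch, assuming Python’s standard pseudo-random generator stands in for the coin toss; the function names, the seed and the number of realisations are illustrative choices, and positions are measured in units of $a$.

```python
import random
from collections import Counter
from math import comb


def simulate_walks(n_steps, n_walks, seed=0):
    """Generate n_walks realisations and return the frequency of each end point m."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    counts = Counter()
    for _ in range(n_walks):
        m = sum(rng.choice((-1, 1)) for _ in range(n_steps))
        counts[m] += 1
    return {m: c / n_walks for m, c in counts.items()}


def exact_probability(n_steps, m):
    """Path-counting result: (number of paths to m) times (1/2)^n."""
    if abs(m) > n_steps or (n_steps - m) % 2:
        return 0.0
    return comb(n_steps, (n_steps + m) // 2) / 2 ** n_steps


frequencies = simulate_walks(n_steps=5, n_walks=100_000)
exact = {m: exact_probability(5, m) for m in range(-5, 6, 2)}
```

The entries of `frequencies` approach those of `exact` as the number of realisations grows, with a statistical error that shrinks roughly as the inverse square root of `n_walks`.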

However, there are ways to calculate evolving probability distributions such as P n (x m ) 
for the symmetric random walk in 1-d, and in the next section, we develop a powerful 
technique. 

16.2 Master Equations 

We solved the symmetric random walk problem by explicitly counting trajectories
emanating from an initial point into a phase space. There is an equivalent point of
view where we imagine the propagation of probability into the phase space, such that
the probability distribution evolves as time progresses. The rules of propagation are based
on the multiplicative and additive laws of probability, or intuitively on the redistribution
of probability brought about by the available decision processes.

For example, in the case of a symmetric random walk, probability is split into two
whenever a choice of step is made. The probability that the particle is at position $x_m$
at timestep $n + 1$ is therefore one half the probability that it was located at $x_{m-1}$ at
timestep $n$, plus one half the probability that it was at $x_{m+1}$. In short, the probability
evolves according to the rule

$$P_{n+1}(x_m) = \tfrac{1}{2}P_n(x_{m-1}) + \tfrac{1}{2}P_n(x_{m+1}), \qquad (16.2)$$

which conveys an idea of probability as a property that is divided as decisions are made,
but accumulated if the outcomes are the same. On reflection, this is precisely the way
Pascal’s triangle works.

More generally, if there are many possible transitions in the previous timestep that
have the effect of bringing a particle to the position $x_m$, we would write

$$P_{n+1}(x_m) = \sum_{m'=-\infty}^{\infty} T(x_m - x_{m'}|x_{m'})P_n(x_{m'}), \qquad (16.3)$$

where $T(\Delta x|x)$ is the transition probability for making a step of size $\Delta x$ given a
starting position of $x$. In principle, this could be time dependent. Note that the scheme is
Markovian: the probability distribution at timestep $n + 1$ depends only on the
probability distribution at timestep $n$, and on transition probabilities that have no memory of
previous times. Equation (16.3) is called a master equation. The origin of the name is
a little obscure, but it is appropriate because it provides the basic framework for the
dynamics of probabilities.

For the symmetric random walk, there are only two nonzero transition probabilities,
describing steps of length $a$ to left or right, and they are both equal to 1/2. The transition
probability is normalised such that

$$\sum_{m=-\infty}^{\infty} T(x_m - x_{m'}|x_{m'}) = 1, \qquad (16.4)$$

since there is a probability of unity that some transition is made starting from position
$x_{m'}$.

The probability of reaching position $m$ at time $n + 1$ is a sum of probabilities of all
possible paths that lead to this point. In the last section we stated that the probability
of a particular path of $n$ steps is equal to $(1/2)^n$. More formally, this path probability
is a product of transition probabilities for each step taken along the path, multiplied
by the probability of starting at the appropriate initial position. We shall consider this
representation of path probabilities again in Chapter 17.

For the random walk, there are only two possible transitions starting at $x_{m'}$ and we
write

$$T(x_m - x_{m'}|x_{m'}) = \tfrac{1}{2}\left(\delta_{m-1,m'} + \delta_{m+1,m'}\right), \qquad (16.5)$$

where we employ the Kronecker delta which is unity if the two indices are equal,
but zero otherwise. The two terms in the brackets represent, respectively, steps to the
right ($m = m' + 1$) and to the left ($m = m' - 1$). Hence, as proposed earlier on intuitive
grounds, the master equation for this system is

$$P_{n+1}(x_m) = \sum_{m'=-\infty}^{\infty} \tfrac{1}{2}\left(\delta_{m-1,m'} + \delta_{m+1,m'}\right)P_n(x_{m'}) = \tfrac{1}{2}P_n(x_{m-1}) + \tfrac{1}{2}P_n(x_{m+1}), \qquad (16.6)$$

since the properties of the Kronecker delta are such that $\sum_j \delta_{ij}F_j = F_i$.

Figure 16.2 Probability distribution for the symmetric random walk in 1-d after five timesteps,
starting from $m = 0$.
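The master equation can be iterated directly. The sketch below is one possible implementation, assuming the distribution is stored as a dictionary from the integer site index $m$ to its probability (a storage choice made here, not in the text); it pushes each probability forward in two halves, which is equivalent to the rule (16.6).

```python
def master_step(P):
    """Apply (16.6) once: each P_n(x_m) is split equally between m-1 and m+1."""
    new = {}
    for m, p in P.items():
        new[m - 1] = new.get(m - 1, 0.0) + 0.5 * p
        new[m + 1] = new.get(m + 1, 0.0) + 0.5 * p
    return new


def evolve(n_steps):
    """Start with the particle at the origin and iterate the master equation."""
    P = {0: 1.0}
    for _ in range(n_steps):
        P = master_step(P)
    return P


P5 = evolve(5)
# Multiplying by 2^5 recovers the corresponding row of Pascal's triangle.
path_counts = {m: round(p * 2 ** 5) for m, p in sorted(P5.items())}
# path_counts == {-5: 1, -3: 5, -1: 10, 1: 10, 3: 5, 5: 1}
```

The accumulation `new.get(..., 0.0) + 0.5 * p` is the step where probabilities arriving at the same site from different origins are added, just as two neighbouring entries are summed in Pascal’s triangle.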

16.2.1 Solution to the Random Walk 

We can often employ a discrete Fourier transform to solve master equations where the
probability distribution is defined for positions $x_m$ with positive and negative integer $m$.
We define the characteristic function

$$G_n(k) = \sum_{m=-\infty}^{\infty} e^{ikx_m}P_n(x_m), \qquad (16.7)$$

such that by multiplying by $e^{ikx_m}$ and summing, the master equation (16.6) can be
turned into

$$\sum_{m=-\infty}^{\infty} e^{ikx_m}P_{n+1}(x_m) = \frac{1}{2}\sum_{m=-\infty}^{\infty} e^{ik(x_{m-1}+a)}P_n(x_{m-1}) + \frac{1}{2}\sum_{m=-\infty}^{\infty} e^{ik(x_{m+1}-a)}P_n(x_{m+1}), \qquad (16.8)$$

where we note that $x_{m-1} + a = x_m = x_{m+1} - a$. In the first sum on the right hand side,
we define a new summation index $M = m - 1$ and in the second we define $M' = m + 1$.
The bounds on the sums are unchanged, and we get

$$\sum_{m=-\infty}^{\infty} e^{ikx_m}P_{n+1}(x_m) = \frac{1}{2}\sum_{M=-\infty}^{\infty} e^{ik(x_M+a)}P_n(x_M) + \frac{1}{2}\sum_{M'=-\infty}^{\infty} e^{ik(x_{M'}-a)}P_n(x_{M'}), \qquad (16.9)$$

which clearly corresponds to

$$G_{n+1}(k) = \tfrac{1}{2}e^{ika}G_n(k) + \tfrac{1}{2}e^{-ika}G_n(k) = \cos(ka)\,G_n(k). \qquad (16.10)$$

Since the particle is at the origin at $t = 0$, we have $P_0(x_m) = \delta_{m0}$ such that $G_0(k) = 1$
from (16.7). By iteration, therefore, we have

$$G_n(k) = \cos^n(ka), \qquad (16.11)$$


and all we have to do now is invert the discrete Fourier transform to obtain $P_n(x_m)$. The
simplest procedure is to use the binomial expansion to write

$$G_n(k) = (p + q)^n = \sum_{l=0}^{n} p^{n-l}q^l\,\frac{n!}{l!\,(n-l)!}, \qquad (16.12)$$

with $p = \tfrac{1}{2}e^{ika}$ and $q = \tfrac{1}{2}e^{-ika}$. Writing $m = n - 2l$, this becomes

$$G_n(k) = \sum_{m=-n}^{n} \frac{1}{2^n}\,\frac{n!\,e^{ika[(n-l)-l]}}{\left(\frac{n-m}{2}\right)!\left(\frac{n+m}{2}\right)!}, \qquad (16.13)$$

noting that $m$ is spaced in increments of two. Note also that $(n - l) - l$ is equal to $m$
and that $ma = x_m$. Hence

$$G_n(k) = \sum_{m=-n}^{n} \frac{1}{2^n}\,\frac{n!}{\left(\frac{n-m}{2}\right)!\left(\frac{n+m}{2}\right)!}\,e^{ikx_m}, \qquad (16.14)$$

and by referring to the definition of the characteristic function (16.7), we read off the
coefficients of $e^{ikx_m}$ to extract the probabilities:

$$P_n(x_m) = \frac{1}{2^n}\,\frac{n!}{\left(\frac{n-m}{2}\right)!\left(\frac{n+m}{2}\right)!}, \qquad (16.15)$$

for $|m| \le n$ and even $(n - m)$, and $P_n(x_m) = 0$ otherwise. This distribution is shown in
Figure 16.2 for $n = 5$. This is entirely consistent with the numbers in Pascal’s triangle
that were used earlier to generate the probability distribution by explicitly counting paths.
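The chain from (16.7) to (16.15) can be checked numerically. In this minimal sketch the inversion is carried out by averaging $\cos^n(ka)\,e^{-ikx_m}$ over `n_k` equally spaced values of $k$ across one period $2\pi/a$; that is the standard inverse of a discrete Fourier sum and is an assumption made here, since the text inverts by inspection instead. The function names are hypothetical.

```python
import cmath
import math
from math import comb


def P_exact(n, m):
    """Closed form (16.15); zero outside |m| <= n or when n - m is odd."""
    if abs(m) > n or (n - m) % 2:
        return 0.0
    return comb(n, (n - m) // 2) / 2 ** n


def G_from_P(n, k, a=1.0):
    """Characteristic function (16.7) summed over the nonzero probabilities."""
    return sum(cmath.exp(1j * k * m * a) * P_exact(n, m) for m in range(-n, n + 1))


def invert(n, m, a=1.0, n_k=64):
    """Recover P_n(x_m) from G_n(k) = cos^n(ka); exact when n_k > 2n."""
    total = 0.0
    for j in range(n_k):
        k = 2.0 * math.pi * j / (n_k * a)
        total += (math.cos(k * a) ** n * cmath.exp(-1j * k * m * a)).real
    return total / n_k


check_G = abs(G_from_P(5, 0.7) - math.cos(0.7) ** 5)   # compares (16.7) with (16.11)
recovered = {m: invert(5, m) for m in range(-5, 6)}
```

`check_G` is zero to rounding error, and `recovered` reproduces the Pascal’s triangle probabilities at odd $m$ and zero at even $m$.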


16.2.2 Entropy Production during a Random Walk 

The probabilities of walking to positions $x_m$ after time $n\tau$ allow us to determine the time
dependence of the Shannon entropy for this process. It is clearly given by

$$S_I = -k\sum_m P_n(x_m)\ln P_n(x_m), \qquad (16.16)$$

and it is equal to zero at $n = 0$, corresponding to complete certainty in location, and
increases with time as uncertainty develops, as shown in Figure 16.3.

If the space on which the walk takes place is infinite, the entropy continues to grow,
but if the space is a ring, such that a move to the right from $x = Ma$ takes the particle to
$x = -Ma$, and vice versa, then intuitively all the probabilities will evolve towards a
constant equal to $(2M + 1)^{-1}$. The Shannon entropy then increases towards an
asymptotic value of $k\ln(2M + 1)$. What we find in such a case is an analogue of the free
expansion of an isolated, initially constrained system into a larger phase space, with
dynamics that generate a final equilibrium state with equal microstate probabilities. The
Shannon entropy increases throughout the expansion until it reaches a value given by
Boltzmann’s expression. Dynamics such as these can clearly form a basis for developing
a treatment of nonequilibrium processes.

Figure 16.3 Shannon entropy after $n$ steps of a 1-d random walk.
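The approach to the asymptotic value on a ring can be followed by iterating the master equation with periodic boundaries. This is a minimal sketch, assuming $M = 3$ (seven sites), 200 timesteps and entropy in units of $k$; all three are illustrative choices.

```python
import math


def ring_step(P):
    """Master equation (16.2) on a ring: the neighbours of the end sites wrap around."""
    n = len(P)
    return [0.5 * P[(i - 1) % n] + 0.5 * P[(i + 1) % n] for i in range(n)]


def shannon(P, k=1.0):
    """Shannon entropy (16.16), skipping sites with zero probability."""
    return -k * sum(p * math.log(p) for p in P if p > 0.0)


M = 3
n_sites = 2 * M + 1
P = [0.0] * n_sites
P[M] = 1.0                      # list index M corresponds to the origin x = 0

entropies = [shannon(P)]
for _ in range(200):
    P = ring_step(P)
    entropies.append(shannon(P))
```

The list `entropies` starts at zero and rises to within rounding error of $\ln 7$, the value of $\ln(2M + 1)$ for this ring.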


16.3 The Continuous Random Walk and the Fokker-Planck Equation

We have discussed the evolving probabilities on a discrete phase space for a particle
undergoing a random walk, and it is also of interest to consider the evolution of
probability density functions in continuous time over a continuum of positions. This would
give us insight into the change in statistical properties of the classical kinetic energy of
a particle as it is heated, for example. We shall approach this by deriving the appropriate
equations that describe a continuous random walk, where the spatial step and the time
step both become infinitesimal.

It is convenient to define the probability density function $p(x,t)$ such that $p(x,t)dx$ is
the probability that the particle lies in the region $x - (1/2)dx \to x + (1/2)dx$. We also
define a Markovian transition probability density $T(x - x'|x')$ such that $T(x - x'|x')dx'$
is the probability that a transition through displacement $\Delta x = x - x'$ is made from a
point lying in the region $x' \pm dx'/2$ in a period $\tau$ starting from time $t$. An integral form
of the master equation (16.3) for the discrete case may then be written as

$$p(x,t+\tau) = \int_{-\infty}^{\infty} p(x',t)\,T(x - x'|x',t)\,dx' = \int_{-\infty}^{\infty} p(x - \Delta x,t)\,T(\Delta x|x - \Delta x,t)\,d\Delta x, \qquad (16.17)$$

which is often called the Chapman-Kolmogorov equation. Note that $T$ has
dimensions of inverse length and is normalised according to $\int T(x - x'|x',t)\,dx' =
\int T(\Delta x|x,t)\,d\Delta x = 1$, the analogue of (16.4).

We can turn this integral equation into a differential equation by performing a Taylor
expansion of the integrand:

$$p(x - \Delta x,t)\,T(\Delta x|x - \Delta x,t) = p(x,t)\,T(\Delta x|x,t) + \sum_{n=1}^{\infty}\frac{1}{n!}(-\Delta x)^n\frac{\partial^n\left(p(x,t)\,T(\Delta x|x,t)\right)}{\partial x^n}, \qquad (16.18)$$

such that (16.17) becomes

$$p(x,t+\tau) - p(x,t) = \int_{-\infty}^{\infty} d\Delta x \sum_{n=1}^{\infty}\frac{1}{n!}(-\Delta x)^n\frac{\partial^n\left(p(x,t)\,T(\Delta x|x,t)\right)}{\partial x^n}. \qquad (16.19)$$

Now we define

$$M_n(x,t) = \frac{1}{\tau}\int_{-\infty}^{\infty} d\Delta x\,(\Delta x)^n\,T(\Delta x|x,t), \qquad (16.20)$$

in which case we can write

$$\frac{1}{\tau}\left(p(x,t+\tau) - p(x,t)\right) = \sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\frac{\partial^n\left(M_n(x,t)\,p(x,t)\right)}{\partial x^n}. \qquad (16.21)$$

This is an infinite order differential-difference equation describing the evolution of the
probability density function $p$, given an underlying transition probability density $T$. It is
often called the Kramers-Moyal equation.

We now consider a random walk in the limit of continuous space and time. At each
point in the walk, defined by position $x$ and time $t$, there is a choice of a step to the right
of length $a + u$, or a step to the left of length $a - u$ (with $a > u$), both with probability
1/2. The transition probability density is taken to be independent of time and position
and we can write

$$T(\Delta x|x,t) = T(\Delta x) = \tfrac{1}{2}\left[\delta(\Delta x - (a + u)) + \delta(\Delta x + (a - u))\right], \qquad (16.22)$$

using the Dirac delta function, defined to be infinite where its argument vanishes, and zero
everywhere else, and satisfying the important normalisation condition $\int_{-\infty}^{\infty}\delta(y)\,dy = 1$
and sifting condition $\int_{-\infty}^{\infty} f(y)\,\delta(y - y_0)\,dy = f(y_0)$. The transition probability density
(16.22) specifies that transitions of precisely $\Delta x = (a + u)$ and $\Delta x = -(a - u)$ are
allowed, and no others. We then evaluate

$$\begin{aligned}
M_1 &= \frac{1}{\tau}\int d\Delta x\,\Delta x\,\tfrac{1}{2}\left[\delta(\Delta x - a - u) + \delta(\Delta x + a - u)\right] = \frac{1}{2\tau}(a + u - a + u) = \frac{u}{\tau}, \\
M_2 &= \frac{1}{\tau}\int d\Delta x\,(\Delta x)^2\,\tfrac{1}{2}\left[\delta(\Delta x - a - u) + \delta(\Delta x + a - u)\right] = \frac{a^2 + u^2}{\tau}, \\
M_3 &= \frac{1}{\tau}\int d\Delta x\,(\Delta x)^3\,\tfrac{1}{2}\left[\delta(\Delta x - a - u) + \delta(\Delta x + a - u)\right] = \frac{3a^2u + u^3}{\tau}, \\
M_4 &= \frac{1}{\tau}\int d\Delta x\,(\Delta x)^4\,\tfrac{1}{2}\left[\delta(\Delta x - a - u) + \delta(\Delta x + a - u)\right] = \frac{a^4 + 6a^2u^2 + u^4}{\tau},
\end{aligned} \qquad (16.23)$$

and so on.

In order to model the continuous random walk, we allow $a$, $u$ and $\tau$ to go to zero
such that $\lim(u/\tau)$ and $\lim(a^2/\tau)$ are both finite, in which case only $M_1$ and $M_2$ are
retained while the other $M_n$ vanish. Furthermore, the difference on the left hand side of
(16.21) becomes a time derivative and we arrive at

$$\frac{\partial p(x,t)}{\partial t} = -\frac{\partial\left(M_1 p(x,t)\right)}{\partial x} + \frac{1}{2}\frac{\partial^2\left(M_2 p(x,t)\right)}{\partial x^2}. \qquad (16.24)$$

If we generalise to a walk where the parameters $a$ and $u$ describing the step lengths
depend on $x$ and $t$, the $M_1$ and $M_2$ also become functions of position and time. This
general result is known as the Fokker-Planck equation: a second order partial differential
equation describing the evolution of a probability density, in this case for a continuous
version of the 1-d random walk.
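The claim that only $M_1$ and $M_2$ survive the limit can be checked with a short calculation. This is a minimal sketch, assuming the particular scaling $u = v\tau$ and $a = (2D\tau)^{1/2}$ so that $u/\tau$ and $a^2/\tau$ stay fixed; the constants $v$ and $D$ are illustrative labels introduced here for those two limits.

```python
def jump_moment(n, a, u, tau):
    """M_n from (16.20) for the two-delta transition density (16.22)."""
    # Steps of +(a + u) and -(a - u), each taken with probability 1/2.
    return 0.5 * ((a + u) ** n + (-(a - u)) ** n) / tau


def scaled_moments(tau, v=1.0, D=0.5):
    """First four moments with u/tau = v and a^2/tau = 2D held fixed (assumed scaling)."""
    u = v * tau
    a = (2.0 * D * tau) ** 0.5
    return [jump_moment(n, a, u, tau) for n in (1, 2, 3, 4)]


for tau in (1e-2, 1e-4, 1e-6):
    M1, M2, M3, M4 = scaled_moments(tau)
    print(f"tau={tau:.0e}  M1={M1:.6f}  M2={M2:.6f}  M3={M3:.2e}  M4={M4:.2e}")
```

As `tau` shrinks, `M1` stays at $v$ and `M2` tends to $2D$, while `M3` and `M4` fall in proportion to `tau`, consistent with the closed forms in (16.23).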


16.3.1 Wiener Process 


The limit of the discrete 1-d random walk with equal step lengths to left and right, that
is with $u = 0$, and with an $a$ that is independent of $x$ and $t$, is called the Wiener process.
The probability density $p(x,t)$ evolves according to (16.24) with $M_1 = 0$ and $M_2 = 2D$,
where $D$ is a constant. The Fokker-Planck equation then takes the form of the diffusion
equation previously used to describe particle transport (15.22):

$$\frac{\partial p(x,t)}{\partial t} = D\frac{\partial^2 p(x,t)}{\partial x^2}. \qquad (16.25)$$

We can solve the diffusion equation using a continuum version of the method employed
in the discrete random walk case. We define the characteristic function

$$G(k,t) = \int_{-\infty}^{\infty} p(x,t)\,e^{ikx}\,dx, \qquad (16.26)$$

such that

$$p(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} G(k,t)\,e^{-ikx}\,dk, \qquad (16.27)$$

and we Fourier transform both sides of the diffusion equation to get

$$\int_{-\infty}^{\infty}\frac{\partial p(x,t)}{\partial t}\,e^{ikx}\,dx = D\int_{-\infty}^{\infty}\frac{\partial^2 p(x,t)}{\partial x^2}\,e^{ikx}\,dx. \qquad (16.28)$$

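Now we perform two integrations by parts, and assume that $p$ and $\partial p/\partial x$ go to zero as
$x \to \pm\infty$, such that

$$\int_{-\infty}^{\infty}\frac{\partial^2 p(x,t)}{\partial x^2}\,e^{ikx}\,dx = \left[\frac{\partial p(x,t)}{\partial x}\,e^{ikx}\right]_{-\infty}^{\infty} - ik\int_{-\infty}^{\infty}\frac{\partial p(x,t)}{\partial x}\,e^{ikx}\,dx = -ik\left(\left[p(x,t)\,e^{ikx}\right]_{-\infty}^{\infty} - ik\int_{-\infty}^{\infty} p(x,t)\,e^{ikx}\,dx\right) = -k^2G(k,t), \qquad (16.29)$$

and hence (16.28) becomes

$$\frac{\partial G}{\partial t} = -k^2DG. \qquad (16.30)$$

Figure 16.4 Evolving Gaussian probability density function characterising a Wiener process.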


Since the particle starts at the origin the initial condition is $p(x,0) = \delta(x)$, such that
$G(k,0) = 1$, and hence we can integrate (16.30) to obtain

$$G(k,t) = \exp(-k^2Dt). \qquad (16.31)$$

The inverse Fourier transform can now be performed:

$$p(x,t) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\exp(-k^2Dt - ikx)\,dk = \frac{1}{2\pi}\int_{-\infty}^{\infty}\exp\left(-Dt\left(k + \frac{ix}{2Dt}\right)^2 - \frac{x^2}{4Dt}\right)dk, \qquad (16.32)$$

and by relabelling the integration variable, shifting the integration contour in the
complex plane, and using the Gaussian integral $\int_{-\infty}^{\infty}\exp(-\alpha x^2)\,dx = (\pi/\alpha)^{1/2}$ we
arrive at the solution

$$p_W(x,t) = \frac{1}{(4\pi Dt)^{1/2}}\exp\left(-\frac{x^2}{4Dt}\right). \qquad (16.33)$$

The result is a Gaussian with a mean of zero and a variance $2Dt$, and the suffix W
reminds us that it is a description of the Wiener process. This is the counterpart of
the probability distribution of the discrete random walk we considered in (16.15). An
illustration of the evolution of the distribution is given in Figure 16.4.
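The correspondence between (16.15) and (16.33) can be checked for a walk with small steps. This is a minimal sketch, assuming $D = a^2/(2\tau)$ (from $M_2 = a^2/\tau = 2D$ with $u = 0$) and illustrative values of $a$, $\tau$ and $n$. After an even number of steps only even $m$ are occupied, so the occupied sites are spaced by $2a$ and the discrete probability is divided by $2a$ before comparison with the density.

```python
from math import comb, exp, pi, sqrt


def walk_probability(n, m):
    """Discrete result (16.15)."""
    if abs(m) > n or (n - m) % 2:
        return 0.0
    return comb(n, (n + m) // 2) / 2 ** n


def wiener_density(x, t, D):
    """Continuum result (16.33)."""
    return exp(-x * x / (4.0 * D * t)) / sqrt(4.0 * pi * D * t)


a, tau = 0.01, 0.0001          # illustrative step length and timestep
D = a * a / (2.0 * tau)        # diffusion coefficient implied by M_2 = 2D
n = 1000
t = n * tau

# Variance of the discrete walk, to be compared with 2*D*t.
variance = sum((m * a) ** 2 * walk_probability(n, m) for m in range(-n, n + 1))

# Probability per unit length at a few even sites versus the Gaussian density.
comparison = [(m * a, walk_probability(n, m) / (2.0 * a), wiener_density(m * a, t, D))
              for m in (0, 20, 40)]
```

For these parameters `variance` equals $2Dt$ to rounding error, and the two densities in `comparison` agree to within a fraction of a percent.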


16.3.2 Entropy Production in the Wiener Process 

We studied the evolution of Shannon entropy in the discrete random walk and we might
expect to find similar behaviour in the case of the Wiener process. Taking the continuum
limit of the expression (16.16), we would seem to require

$$S_I(t) = -k\lim_{\delta x\to 0}\sum \delta x\,p_W(x,t)\ln\left(p_W(x,t)\,\delta x\right). \qquad (16.34)$$

We here encounter a difficulty that has been present ever since we defined the Gibbs
entropy, and its Shannon generalisation, in Chapter 8. How do we define the entropy of
a continuous probability density function? If we insert (16.33), we obtain

$$\begin{aligned}
S_I(t) &= -k\int dx\,p_W(x,t)\left[-\frac{x^2}{4Dt} + \ln\left((4\pi Dt)^{-1/2}\right)\right] - k\lim_{\delta x\to 0}\sum \delta x\,p_W(x,t)\ln(\delta x) \\
&= \frac{k}{4Dt}\langle x^2\rangle - k\ln\left((4\pi Dt)^{-1/2}\right) - k\lim_{\delta x\to 0}\sum \delta x\,p_W(x,t)\ln(\delta x) \\
&= \frac{k}{2} + \frac{k}{2}\ln(4\pi Dt) - k\langle\ln\delta x\rangle.
\end{aligned} \qquad (16.35)$$

The problem with this result is that the last term diverges to $\infty$ when we approach a
continuum limit. This will not do.

The difficulty is somewhat avoided if we consider instead the difference in Shannon
entropies at two times $t_1$ and $t_2 > t_1$: using a compact notation, we write for a general pdf

$$\begin{aligned}
\Delta S_I &= -k\int dx\,p(x,t_2)\ln\left(p(x,t_2)\,\delta x\right) + k\int dx\,p(x,t_1)\ln\left(p(x,t_1)\,\delta x\right) \\
&= -k\int dx\,p(x,t_2)\ln p(x,t_2) + k\int dx\,p(x,t_1)\ln p(x,t_1),
\end{aligned} \qquad (16.36)$$

reducing for the Wiener process to

$$\Delta S_I = \frac{k}{2}\ln(4\pi Dt_2) - \frac{k}{2}\ln(4\pi Dt_1) = \frac{k}{2}\ln\left(\frac{t_2}{t_1}\right), \qquad (16.37)$$

which steadily increases with $t_2 - t_1$. This resembles the behaviour of $S_I$ for the discrete
walk shown in Figure 16.3.
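Both the divergence in (16.35) and its cancellation in (16.37) can be seen numerically. This is a minimal sketch, assuming the sum in (16.34) is evaluated on a uniform grid of cells of width `dx` truncated at twelve standard deviations (a choice made here for illustration), with entropy in units of $k$.

```python
from math import exp, log, pi, sqrt


def wiener_density(x, t, D):
    """Gaussian pdf (16.33) of the Wiener process."""
    return exp(-x * x / (4.0 * D * t)) / sqrt(4.0 * pi * D * t)


def discretised_entropy(t, D, dx, k=1.0, half_width=12.0):
    """Evaluate -k sum dx p ln(p dx) from (16.34) at finite cell width dx."""
    sigma = sqrt(2.0 * D * t)
    n_cells = int(half_width * sigma / dx)
    total = 0.0
    for j in range(-n_cells, n_cells + 1):
        p = wiener_density(j * dx, t, D)
        if p > 0.0:
            total -= k * dx * p * log(p * dx)
    return total


D, t1, t2 = 0.5, 1.0, 4.0
for dx in (0.1, 0.01, 0.001):
    s1 = discretised_entropy(t1, D, dx)
    s2 = discretised_entropy(t2, D, dx)
    print(f"dx={dx}: S_I(t1)={s1:.4f}  S_I(t2)={s2:.4f}  difference={s2 - s1:.6f}")
```

Each tenfold reduction in `dx` raises both entropies by $\ln 10$, while the difference stays at $\frac{1}{2}\ln(t_2/t_1) = \ln 2$ for these times.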

It makes sense to define the entropy of a system described by a pdf over a continuous
phase space using the expression

$$S_{\rm sys}(t) = -k\int dx\,p(x,t)\ln\left(\frac{p(x,t)}{p_{\rm ref}}\right), \qquad (16.38)$$

having inserted a constant probability density $p_{\rm ref}$ to ensure that the argument of the
logarithm is dimensionless. The difference in values of $S_{\rm sys}$ for two pdfs $p_1(x,t)$ and
$p_2(x,t)$ is identical to the difference in $S_I$ for the same cases, but $S_{\rm sys}$ is mathematically
better behaved than $S_I$.

We can test these ideas for the case of a single spin zero particle in a 3-d box of volume
$V$. We expect that $p(x,t) = 1/V$ in equilibrium, in which case $S_{\rm sys} = k\ln(p_{\rm ref}V)$. We
know that the correct entropy for such a system in canonical equilibrium at temperature
$T$, calculated from quantum mechanics in Section 9.3, is $S_G = T^{-1}\langle E\rangle + k\ln Z_1 = \frac{3}{2}k +
k\ln(n_qV) = k\ln(e^{3/2}n_qV)$ and so for consistency we should choose $p_{\rm ref} = e^{3/2}n_q$, where
$n_q(T)$ is the quantum concentration. Just as we determined the parameter $h_0$ in the
partition function over a continuous phase space in Section 14.1, by making reference to
a quantum mechanical version of the same problem where the phase space is discrete,
we can similarly determine the correct form of the entropy for continuous systems.

16.4 Brownian Motion 

We shall use the model of a continuous random walk to understand some aspects of 
the phenomenon of Brownian motion. In 1827, the botanist Robert Brown (1773-1858) 
described the jiggling motion, when suspended in water, of tiny particles of a few 
tens of microns in diameter found inside pollen grains. Later in the century, with the 
emergence of kinetic gas theory, attempts were made to ascribe this motion to the 
fluctuating bombardment of the particles by the surrounding molecules. However, a 
predictive model based on this assumption was not available until Einstein in 1905 
adopted a statistical approach. He abandoned any attempt to describe the microscopic
bombardment in detail and just introduced an unknown transition probability density $T$
describing shifts in particle position $\Delta x$ over a time interval $\tau$. The first and second
moments of the transition probability density (divided by $\tau$) correspond to $M_1$ and $M_2$
in the Fokker-Planck equation.

We can interpret $M_1 = \langle\Delta x\rangle/\tau$ as a mean or drift velocity $v_D$, and $(1/2)M_2 =
\langle(\Delta x)^2\rangle/(2\tau)$ may again be written as a diffusion coefficient $D$, both assumed to be
constants. For Brownian motion in 1-d, the pdf should therefore satisfy

$$\frac{\partial p(x,t)}{\partial t} = -v_D\frac{\partial p(x,t)}{\partial x} + D\frac{\partial^2 p(x,t)}{\partial x^2}, \tag{16.39}$$

and if a particle is released from a position $x_0$, the initial condition is $p(x,0) = \delta(x - x_0)$.

The remaining task is to relate this behaviour to other phenomena. Einstein assumed
that the steady state probability density function for the particle position in a gravitational
field was the canonical equilibrium pdf previously introduced by Gibbs and Boltzmann,
and derived in Section 6.1.3, namely $p(x,\infty) \propto \exp(-mgx/kT)$, where $g$ is gravitational
acceleration. Quite clearly, the particle is in thermal contact with its environment through
the jiggling it receives from the surrounding molecules, and $T$ is the temperature of that
environment. Inserting this into (16.39), Einstein found that he could relate $D$ to the drift
velocity $v_D$ and the prevailing temperature:

$$D = -\frac{kT}{mg}\,v_D. \tag{16.40}$$

Einstein then chose the drift velocity to be equal to the settling velocity of a small sphere
of mass $m$ and radius $r$ under gravity, which is calculable from classical mechanics.
For micron size spherical particles and low velocities such that the fluid flow pattern
is laminar, the downward force $mg$ is balanced by a drag force $-\alpha v_D$ (with $v_D$ negative
for downward drift), where $\alpha$ is the drag coefficient given by Stokes' law, $\alpha = 6\pi r\mu_v$,
and $\mu_v$ is the viscosity of the host fluid.
Figure 16.5 Illustration of Brownian motion in a gravitational field. A particle follows a
complicated path that is difficult to predict in detail and is best represented stochastically.
The outcome is an evolving pdf of particle height $x$ from an arbitrary initial distribution
towards a final Boltzmann distribution. Downward drift under gravity is eventually balanced
by upward diffusion brought about by the density gradient.

Hence we obtain an expression for the diffusion coefficient for Brownian motion:

$$D = \frac{kT}{\alpha}, \tag{16.41}$$


which is known as the Einstein relation. Such a relationship between the particle diffusion 
rate, its size and the properties of the host medium was confirmed by Perrin in 1909. 
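To get a feel for the magnitudes involved, the Einstein relation can be evaluated for a representative case. This is a minimal sketch: the particle radius, temperature and the viscosity of water used below are illustrative assumptions rather than values taken from the text, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def stokes_drag(radius, viscosity):
    """Drag coefficient alpha = 6 pi r mu for a sphere in laminar flow."""
    return 6.0 * math.pi * radius * viscosity


def einstein_diffusion(temperature, radius, viscosity):
    """Diffusion coefficient D = kT / alpha from the Einstein relation."""
    return K_B * temperature / stokes_drag(radius, viscosity)


def rms_displacement(D, t):
    """Root mean square displacement in 1-d after time t, from <x^2> = 2 D t."""
    return math.sqrt(2.0 * D * t)


# Assumed example: sphere of radius 0.5 micron in water (viscosity taken as 1.0e-3 Pa s) at 293 K.
D_water = einstein_diffusion(293.0, 0.5e-6, 1.0e-3)
print("D =", D_water, "m^2/s")
print("rms displacement after 1 s =", rms_displacement(D_water, 1.0), "m")
```

Under these assumptions the diffusion coefficient is of order $10^{-13}$ m$^2$ s$^{-1}$, so the particle wanders roughly a micron per second, slow enough to follow under a microscope.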

It is sometimes claimed that Einstein’s work confirmed that molecules exist. In fact it 
did this only indirectly. The hypothesis that molecular bombardment jiggled the particles 
had been around for some years, but quantifying the effect was too difficult. The main 
thrust of Einstein’s work was to reduce the complex dynamics down to something very 
straightforward: stochastic particle motion characterised by a diffusion equation with 
drift. This captured the essence of the dynamics and then by making the behaviour 
consistent with other effects, the diffusion rate could be deduced. This is a key strategy 
in statistical physics. 

Thus, Einstein seemed to confirm that a cloud of suspended particles large enough to be 
visible under a microscope will relax towards a canonical probability distribution function 
that is exponential with height, as sketched in Figure 16.5. The visible motion of the 
particles could be viewed as a scaled up, and slowed down, version of the kinetic turmoil 
operating at the truly molecular level. This work provided strong support for Boltzmann’s 
then not universally accepted view that matter was composed of particles and that their 
dynamics could be described statistically. Unfortunately, these developments came too 
late for Boltzmann, who took his own life in 1906. 


16.5 Transition Probability Density for a Harmonic Oscillator 


We shall discuss one further case of probability dynamics, describing the Brownian
motion of a particle tethered to a point by a harmonic force with a spring constant
$\kappa$, as illustrated in Figure 16.6. We proceed heuristically as we did in the last section
by identifying the $x$-dependent drift velocity through a balance between the harmonic
force and a drag force. Therefore we write $\alpha v_D(x) = -\kappa x$ and the Fokker-Planck
equation as

$$\frac{\partial p(x,t)}{\partial t} = \frac{\kappa}{\alpha}\frac{\partial (x p(x,t))}{\partial x} + D\frac{\partial^2 p(x,t)}{\partial x^2}
= \frac{\partial}{\partial x}\left(\frac{\kappa}{\alpha}\,x p(x,t) + D\frac{\partial p(x,t)}{\partial x}\right). \tag{16.42}$$

Figure 16.6 1-d Brownian motion in a harmonic well. The pdf is given by the
Ornstein-Uhlenbeck expression (16.44) at three different times. The mean position drifts from
the release point towards the tether point, and the variance of position evolves towards that of the
canonical distribution as $t \to \infty$. An example of a realisation of the motion is shown.


We expect the canonical equilibrium pdf of particle position to be $p(x,\infty) \propto
\exp(-\kappa x^2/(2kT))$, as derived in Section 6.1.2, and since this satisfies

$$\frac{\kappa}{\alpha}\,x p(x,\infty) + D\frac{dp(x,\infty)}{dx} = \frac{\kappa}{\alpha}\,x p(x,\infty) - D\frac{\kappa x}{kT}\,p(x,\infty) = 0, \tag{16.43}$$

using (16.41), it also satisfies (16.42).

If the particle is released at $x = x_0$ at $t = 0$, we would need to solve (16.42) subject
to the initial condition $p(x,0) = \delta(x - x_0)$ in order to understand the time dependence
of the tethered Brownian motion. Unfortunately, this is a rather lengthy derivation, and
so we instead simply quote the solution:

$$p_{\rm OU}(x - x_0, t) = \left(\frac{\kappa}{2\pi kT\left(1 - e^{-2\kappa t/\alpha}\right)}\right)^{1/2}
\exp\left(-\frac{\kappa\left(x - x_0 e^{-\kappa t/\alpha}\right)^2}{2kT\left(1 - e^{-2\kappa t/\alpha}\right)}\right), \tag{16.44}$$


which can be checked by explicit insertion into (16.42). 
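The explicit insertion can also be carried out numerically. The sketch below, assuming a Gaussian with mean $x_0 e^{-\kappa t/\alpha}$ and variance $(kT/\kappa)(1 - e^{-2\kappa t/\alpha})$ together with $D = kT/\alpha$, estimates each term of (16.42) by central finite differences and returns the residual; the function names and the step size `h` are illustrative choices.

```python
import math


def p_ou(x, t, x0, kappa=1.0, alpha=1.0, kT=1.0):
    """Ornstein-Uhlenbeck transition pdf for release at x0 at time zero."""
    decay = math.exp(-kappa * t / alpha)
    var = kT * (1.0 - decay * decay) / kappa
    return math.exp(-(x - x0 * decay) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fokker_planck_residual(x, t, x0, kappa=1.0, alpha=1.0, kT=1.0, h=1e-3):
    """dp/dt - (kappa/alpha) d(xp)/dx - D d2p/dx2, estimated by central differences."""

    def p(xx, tt):
        return p_ou(xx, tt, x0, kappa, alpha, kT)

    dp_dt = (p(x, t + h) - p(x, t - h)) / (2.0 * h)
    d_xp_dx = ((x + h) * p(x + h, t) - (x - h) * p(x - h, t)) / (2.0 * h)
    d2p_dx2 = (p(x + h, t) - 2.0 * p(x, t) + p(x - h, t)) / (h * h)
    D = kT / alpha  # Einstein relation
    return dp_dt - (kappa / alpha) * d_xp_dx - D * d2p_dx2


# The residual should vanish up to discretisation error.
print(fokker_planck_residual(0.3, 0.7, 1.0))
```

A residual that is small compared with the individual terms indicates that the quoted solution satisfies the equation at the chosen point; it is a spot check, not a proof.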

This should be viewed as a transition probability density, as it is a solution conditional
on a definite position $x_0$ at the initial time. It is a Gaussian distribution with time-dependent
mean and variance, and its evolution is sketched in Figure 16.6. It satisfies
our intuitive expectation that the distribution spreads out from the release point, but in
contrast to the Wiener process, it does not continue to spread forever, because of the
tethering of the particle to the origin. The label OU stands for Ornstein-Uhlenbeck,
which is the technical name given to Brownian motion in a harmonic potential. We shall
employ this transition probability density in Chapter 17.


Exercises 


16.1 A particle starts at the origin and at each timestep $\tau$ either remains in position,
or with a constant probability $\lambda$ moves to the right through a distance $a$. Write
down the master equation describing the evolution of the probability $P_n(x_m)$ that
the particle should be found at position $x_m = ma$ at time $t = n\tau$. Solve this master
equation to derive $P_n(x_m)$. Plot the distribution at $n = 8$ for $\lambda = 0.2$. Calculate the
mean and standard deviation of the position label $m$ at $n = 8$.

16.2 In the 'Ehrenfest Urn' problem, a particle moves randomly on a grid of positions
$x = ma$ with $m$ an integer in the range $-L \le m \le L$, and with timestep
$\tau$. The probability, when at position $m'$, of a step to the right $m' \to m' + 1$ is
$T_+ = \frac{1}{2}\left(1 - \frac{m'}{L}\right)$ and the probability of a step to the left $m' \to m' - 1$ is
$T_- = \frac{1}{2}\left(1 + \frac{m'}{L}\right)$. Evaluate the coefficients of the Kramers-Moyal equation for this
process. Take the continuum limits $a \to 0$, $\tau \to 0$, $L \to \infty$ such that $a^2/\tau \to 2D$
and $La^2 \to 2\sigma^2$, where $D$ and $\sigma$ are constants, to show that the Fokker-Planck
equation describing the evolution of the pdf $p(x,t)$ is

$$\frac{\partial p}{\partial t} = \frac{D}{\sigma^2}\frac{\partial (xp)}{\partial x} + D\frac{\partial^2 p}{\partial x^2}.$$

Verify by substitution that $p(x) = (2\pi\sigma^2)^{-1/2}\exp(-x^2/2\sigma^2)$ is the time-independent
solution to this equation.

16.3 Mr and Mrs Ehrenfest keep $N$ rabbits, and house them in two rabbit hutches, one
blue and one pink. Every morning they select one of the rabbits at random and
move it to the other hutch. The probability that on day $n$ there are $m$ rabbits in the
pink hutch is $P_n(m)$, with $0 \le m \le N$. Show that the master equations describing
the process are

$$P_{n+1}(m) = \frac{m+1}{N}P_n(m+1) + \frac{N-m+1}{N}P_n(m-1),$$

except for $m = 0$ and $m = N$, for which $P_{n+1}(0) = (1/N)P_n(1)$ and $P_{n+1}(N) =
(1/N)P_n(N-1)$. Verify that the following time-independent probability distribution
satisfies the master equations

$$P_n(m) = P(m) = \frac{1}{2^N}\frac{N!}{m!\,(N-m)!}.$$

After some time, the Ehrenfests find that the number of rabbits has increased.
Express $P(m)$ in terms of $\mu = m - \frac{1}{2}N$ and show that if $N \gg 1$ and $\mu \ll N/2$
then the distribution may be approximated by

$$P(\mu) \propto \exp\left(-\frac{\mu^2}{2\sigma^2}\right),$$

and show that the variance $\sigma^2$ is $N/4$. You might need to use the following
approximations: $\ln k! \approx k\ln k - k$ for $k \gg 1$ and $\ln(1 + x) \approx x - \frac{1}{2}x^2$ for $|x| \ll 1$.

16.4 A set of radioactive atoms decays with time. The process may be modelled
approximately using the Fokker-Planck equation

$$\frac{\partial p(n,t)}{\partial t} = \lambda\frac{\partial (n p(n,t))}{\partial n} + \frac{\lambda}{2}\frac{\partial^2 (n p(n,t))}{\partial n^2},$$

where $p(n,t)$ is the probability that $n$ atoms remain after time $t$ and where $\lambda$
is a constant. Using an integration by parts, derive an expression for $d\langle n\rangle/dt$,
where the mean population is given by $\langle n\rangle = \int_0^\infty n p(n,t)\,dn$. You may make the
approximation that both $p(n,t)$ and $\partial p(n,t)/\partial n$ vanish at $n = 0$ and $\infty$. Also
derive an expression for $d\langle n^2\rangle/dt$.

16.5 Estimate the root mean square displacement of a particle of radius 1 μm after
one minute if it is suspended in air at room temperature. The viscosity of air is
$18 \times 10^{-6}$ N s m$^{-2}$.

16.6 Sketch the time evolution of the system entropy of a harmonically tethered Brownian
particle arising from the instantaneous halving of the spring constant, assuming
it had previously been in equilibrium.





17 

Fluctuation Relations 


In recent years, the field of nonequilibrium statistical thermodynamics
has developed in ways that cast new light on the concept of
entropy and its production, and it is the aim of this chapter to provide
a brief introduction to ideas that are summed up under the generic
name of fluctuation relations. This is fairly challenging stuff, hence
another warning sign!

17.1 Forward and Backward Path Probabilities: a Criterion 
for Equilibrium 

It is crucial to appreciate what is meant by equilibrium in statistical thermodynamics. On 
a macroscopic scale, it is quite clear that this means a state of a system where nothing 
changes or flows, but on the microscopic scale, such a concept cannot apply as particles 
are in motion, and there are fluctuations in the amount of energy and material in a 
system due to exchanges with the environment. We have stated that equilibrium on the 
microscale means that the statistical properties of the system are independent of time, 
and furthermore that there are no mean flows through the system between different parts 
of the environment. We shall extend this concept now in the following way. 

As a system evolves, sequences of events will be observed, and we might wish to
ascribe a probability to each sequence. For example, in the one dimensional Brownian
motion of a particle in a harmonic potential $\phi = (1/2)\kappa x^2$ with a drag coefficient $\alpha$,
considered in Section 16.5, we can imagine a sequence of events consisting of an observation
of the particle at time $t = 0$ in the position range $x_0 \pm (1/2)dx_0$, followed by its
observation at a later time $t$ in the range $x \pm (1/2)dx$. The sequence, described here as a
path, is sketched in Figure 17.1 in terms of a trajectory between two points in the $x$-$t$
plane, although it should be borne in mind that we are not specifying exactly how the
particle reaches its destination, so the trajectory shown is only one possibility.

Figure 17.1 Sketch of a forward sequence of events, or path, consisting of the observation of a
Brownian particle at coordinates $(x_0, 0)$ and then $(x, t)$, and the backward sequence corresponding
to the opposite order.


The probability of such a sequence of events, with its implicit inclusion of all intervening
trajectories, is given by

$$\mathcal{P}(x,t;x_0,0) = T(x - x_0|x_0)\,p(x_0,0)\,dx_0\,dx, \tag{17.1}$$

which is written in such a way as to resemble the notation used in the discussion of the
random walk in Sections 16.2 and 16.3. It is a product of the probability of the initial
event, the observation in the region of $x_0$, multiplied by the probability of observation in
the region of $x$ given that motion began at $x_0$. In fact for the system in question we can
use the transition probability density for the Ornstein-Uhlenbeck process introduced in
(16.44), and write

$$\mathcal{P}(x,t;x_0,0) = p_{\rm OU}(x - x_0,t)\,p(x_0,0)\,dx_0\,dx. \tag{17.2}$$

The key concept is that a state of equilibrium is a situation where the probabilities of 
realising a sequence of events and the exact opposite sequence are precisely the same. 
This is so important as to bear repeating. Equilibrium means that the likelihood that the 
system goes backward is the same as its likelihood of going forward. We can demonstrate 
that this is so by writing the probability of the reverse of the sequence in question as 


$$\mathcal{P}(x_0,t;x,0) = p_{\rm OU}(x_0 - x,t)\,p(x,0)\,dx\,dx_0, \tag{17.3}$$

and examining the ratio of these so-called forward and backward path probabilities:

$$\frac{\mathcal{P}(x,t;x_0,0)}{\mathcal{P}(x_0,t;x,0)} = \frac{p_{\rm OU}(x - x_0,t)\,p(x_0,0)}{p_{\rm OU}(x_0 - x,t)\,p(x,0)} = 1, \tag{17.4}$$

having inserted (16.44) as well as the equilibrium canonical pdf $p(x,0) \propto
\exp(-\kappa x^2/2kT)$.

What this tells us is that not only does the equilibrium state have time independent 
statistical properties, such as the mean and variance of particle position, but also that any 
sequence of events, at least of the kind we have considered, is equally likely to be seen 
going forward or going backward. The ratio of path probabilities is a powerful statistical 
expression of what is meant by reversibility in a randomly evolving system. Everything 
we observe is just as likely to be seen in reverse, if the system is in an equilibrium state. 
We have illustrated this for a simple tethered Brownian particle, but the same definition 
of equilibrium can be extended to more general cases. For a system in equilibrium, we 
cannot tell statistically whether we are viewing a movie of its behaviour running forward 
or backward in time. 
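The equality of forward and backward path probabilities can be confirmed numerically for particular start and end points. The sketch below assumes the Gaussian Ornstein-Uhlenbeck transition pdf of Section 16.5 and a Gaussian initial pdf; the optional argument `kappa_init` is a hypothetical device for choosing an initial pdf that is not the canonical one, in which case the ratio departs from unity.

```python
import math


def gaussian(x, mean, var):
    """Normalised Gaussian pdf."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def p_ou(x, t, x0, kappa, alpha, kT):
    """Ornstein-Uhlenbeck transition pdf for a move from x0 to x in time t."""
    decay = math.exp(-kappa * t / alpha)
    return gaussian(x, x0 * decay, kT * (1.0 - decay * decay) / kappa)


def path_ratio(x0, x, t, kappa=1.0, alpha=1.0, kT=1.0, kappa_init=None):
    """Ratio of forward (x0 -> x) to backward (x -> x0) path probabilities."""
    if kappa_init is None:
        kappa_init = kappa  # canonical equilibrium initial pdf
    var_init = kT / kappa_init
    forward = p_ou(x, t, x0, kappa, alpha, kT) * gaussian(x0, 0.0, var_init)
    backward = p_ou(x0, t, x, kappa, alpha, kT) * gaussian(x, 0.0, var_init)
    return forward / backward


print(path_ratio(0.4, -1.3, 0.8))                  # equilibrium initial pdf
print(path_ratio(0.4, -1.3, 0.8, kappa_init=0.5))  # broader, noncanonical initial pdf
```

The first ratio should be unity to within rounding error for any choice of positions and elapsed time, while the second differs from unity because the initial pdf is not the equilibrium one.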


17.2 Time Asymmetry of Behaviour and a Definition 
of Entropy Production 

Having established what is meant by equilibrium, it is straightforward to define what we 
mean when we say a system is out of equilibrium. For such a system, the probabilities 
of observing some sequences and their reversals are not equal. This is a deeper statistical 
meaning to the concept of the irreversibility of a process. We shall use the ratio of path 
probabilities as we did in the previous section to obtain a measure of how far the system 
might be away from equilibrium. 

The manner in which we do so requires some careful explanation. We shall consider
a 1-d harmonic oscillator that is relaxing towards equilibrium while the parameters $\kappa$,
$\alpha$ and $T$ remain constant. The statistical state of the system is represented by a
time-dependent probability density function $p(x,t)$. We observe the system at $t = 0$, at $t = \Delta t$
and again at $t = 2\Delta t$, defining two observational intervals of length $\Delta t$. We consider
the probability of observing a path from $x_0$ to $x$ in the first of these intervals, and quite
independently, the probability of a path from $x$ to $x_0$ during the second, as illustrated in
Figure 17.2.

If the system were in an equilibrium state with time-independent statistics, then according
to the argument presented in the last section, the two probabilities would be equal,
for all possible paths, but in general they will differ. So we define

$$\frac{\mathcal{P}(x,\Delta t;x_0,0)}{\mathcal{P}(x_0,2\Delta t;x,\Delta t)} = \frac{p_{\rm OU}(x - x_0,\Delta t)\,p(x_0,0)}{p_{\rm OU}(x_0 - x,\Delta t)\,p(x,\Delta t)}
= \exp\left(\frac{\Delta s_i(x,\Delta t;x_0,0)}{k}\right), \tag{17.5}$$

where $\Delta s_i(x,\Delta t;x_0,0)$ is a property that we assign to the path from $x_0$ to $x$. In contrast
to (17.4), we consider $p(x,t)$ to be noncanonical in form. We expect to find that the
ratio is not unity for all sequences, or equivalently that $\Delta s_i(x,\Delta t;x_0,0)$ differs from
zero depending on the start and end positions of the path. The key task now is to look
at the statistics of $\Delta s_i$, and specifically to examine how it fluctuates away from zero.

We start by evaluating the mean of the quantity $\exp(-\Delta s_i(x,\Delta t;x_0,0)/k)$ over all
possible paths $x_0 \to x$, each of which occurs with probability $\mathcal{P}(x,\Delta t;x_0,0)$. The mean
is written as

$$\left\langle\exp\left(-\frac{\Delta s_i}{k}\right)\right\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p_{\rm OU}(x - x_0,\Delta t)\,p(x_0,0)
\exp\left(-\frac{\Delta s_i(x,\Delta t;x_0,0)}{k}\right) dx\,dx_0, \tag{17.6}$$

but by inserting the definition (17.5), this can be written

$$\left\langle\exp\left(-\frac{\Delta s_i}{k}\right)\right\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p_{\rm OU}(x_0 - x,\Delta t)\,p(x,\Delta t)\,dx\,dx_0 = 1. \tag{17.7}$$

This is unity since the final integral is simply a normalisation of the probability that a
backward path is observed, starting and finishing anywhere at all. This straightforwardly
derived but powerful result is called the integral fluctuation relation.

Figure 17.2 Forward and backward paths defined over successive periods $\Delta t$ when a system is
relaxing towards equilibrium.

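We now make use of the inequality $\exp(z) \ge 1 + z$ that holds for any real value of
$z$. This implies that the average of $\exp(z)$ over any distribution of the variable $z$ should
satisfy $\langle\exp(z)\rangle \ge 1 + \langle z\rangle$ and if we insert $z = -\Delta s_i/k$ the consequence is

$$1 \ge 1 - \frac{\langle\Delta s_i\rangle}{k}, \quad\text{or}\quad \langle\Delta s_i\rangle \ge 0, \tag{17.8}$$

using (17.7).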

We conclude that $\Delta s_i$, a property associated with the evolution of a tethered Brownian
particle as it evolves over the period $0 \le t \le \Delta t$ (and implicitly a reflection of projected
behaviour over a further period $\Delta t$), is positive when averaged over all possible paths,
unless the system initially takes a canonical equilibrium pdf over its phase space. If the
system does take the equilibrium distribution, $\Delta s_i$ is zero for all observed behaviours,
and so its average is zero too. This is starting to sound rather familiar!

Let us therefore consider whether the value of $\Delta s_i$ for a given period, averaged over
all possible system behaviour under the prescribed stochastic or random dynamics, and
starting from an initial situation specified by a given pdf over position, might correspond
to the associated internal production of entropy, namely $\Delta S_i = \langle\Delta s_i\rangle$. We need to check
this assertion explicitly.
17.3 The Relaxing Harmonic Oscillator


A relaxation process involving a 1-d classical harmonic oscillator in thermal contact with
a heat bath is governed by the Fokker-Planck equation

$$\frac{\partial p(x,t)}{\partial t} = \frac{\kappa}{\alpha}\frac{\partial (x p(x,t))}{\partial x} + \frac{kT}{\alpha}\frac{\partial^2 p(x,t)}{\partial x^2}, \tag{17.9}$$

as in (16.42), where $\kappa$ is the spring constant and $\alpha$ the drag coefficient. This equation
applies for 'overdamped' conditions, where the drag force on the particle is strong, and
where we need not concern ourselves with deviations from the Maxwell-Boltzmann
distribution of particle velocities during the relaxation, but can focus instead on changes
in the distribution over position.

We consider the evolution of $p(x,t)$ starting from the initial condition

$$p(x,0) = \left(\frac{\kappa_0}{2\pi kT}\right)^{1/2}\exp\left(-\frac{\kappa_0 x^2}{2kT}\right), \tag{17.10}$$

with $\kappa_0 \ne \kappa$. We are interested in the statistical behaviour of the quantity $\Delta s_i$ associated
with the relaxation of the system pdf towards equilibrium, as illustrated in Figure 17.3.

It may be shown that the solution to the Fokker-Planck equation (17.9) with this
initial condition takes the Gaussian form

$$p(x,t) = \left(\frac{\tilde\kappa(t)}{2\pi kT}\right)^{1/2}\exp\left(-\frac{\tilde\kappa(t)x^2}{2kT}\right), \tag{17.11}$$

and by inserting this into (17.9), the function $\tilde\kappa(t)$ can be shown to satisfy

$$\frac{d\tilde\kappa}{dt} = \frac{2}{\alpha}\tilde\kappa\,(\kappa - \tilde\kappa), \tag{17.12}$$

with $\tilde\kappa(0) = \kappa_0$. This is equivalent to $dz/dt = 2\alpha^{-1}(1 - \kappa z)$, where $z = \tilde\kappa^{-1}$, which may
be solved for $z(t)$ to give

$$\tilde\kappa(t) = \frac{\kappa_0\kappa}{\kappa_0 + e^{-2\kappa t/\alpha}(\kappa - \kappa_0)}, \tag{17.13}$$

which is also illustrated in Figure 17.3.

Figure 17.3 Evolution of a pdf describing a Brownian particle in a harmonic potential as the
system relaxes from an initial nonequilibrium state, characterised by a Gaussian with parameter
$\tilde\kappa = \kappa_0$, towards an equilibrium state where $\tilde\kappa = \kappa$. The time dependence of the average of $\Delta s_i$ is
also shown.
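The closed form for the time-dependent stiffness can be compared with a direct numerical integration of the relaxation equation. This is a sketch under the assumption that the rate equation is $d\tilde\kappa/dt = (2/\alpha)\tilde\kappa(\kappa - \tilde\kappa)$ with $\tilde\kappa(0) = \kappa_0$; the fourth-order Runge-Kutta integrator and the function names are illustrative choices.

```python
import math


def kappa_tilde_exact(t, kappa0, kappa, alpha):
    """Closed-form stiffness parameter of the relaxing Gaussian pdf."""
    return kappa0 * kappa / (kappa0 + (kappa - kappa0) * math.exp(-2.0 * kappa * t / alpha))


def kappa_tilde_rk4(t_end, kappa0, kappa, alpha, steps=2000):
    """Integrate d(kappa_tilde)/dt = (2/alpha) kappa_tilde (kappa - kappa_tilde) by RK4."""

    def rate(kt):
        return 2.0 * kt * (kappa - kt) / alpha

    h = t_end / steps
    kt = kappa0
    for _ in range(steps):
        k1 = rate(kt)
        k2 = rate(kt + 0.5 * h * k1)
        k3 = rate(kt + 0.5 * h * k2)
        k4 = rate(kt + h * k3)
        kt += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return kt


# Relaxation from kappa0 = 1 towards kappa = 2 with alpha = 1.
for t in (0.0, 0.5, 1.5):
    print(t, kappa_tilde_exact(t, 1.0, 2.0, 1.0), kappa_tilde_rk4(t, 1.0, 2.0, 1.0) if t > 0 else 1.0)
```

The two columns should agree, with the stiffness starting at $\kappa_0$ and approaching $\kappa$ on a timescale of order $\alpha/2\kappa$.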
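We can then write, by inserting (16.44) and (17.11) into the definition (17.5) with a general
elapsed time $t$,

$$\Delta s_i(x,t;x_0,0) = \frac{1}{2T}\left[(\tilde\kappa(t) - \kappa)x^2 + (\kappa - \kappa_0)x_0^2\right] + \frac{k}{2}\ln\left(\frac{\kappa_0}{\tilde\kappa(t)}\right), \tag{17.14}$$

and we can average this over the probability distribution of paths to get

$$\langle\Delta s_i\rangle = \int\!\!\int p_{\rm OU}(x - x_0,t)\,p(x_0,0)\,\Delta s_i(x,t;x_0,0)\,dx\,dx_0. \tag{17.15}$$

In order to proceed, we note that

$$p(x,t) = \int p_{\rm OU}(x - x_0,t)\,p(x_0,0)\,dx_0, \tag{17.16}$$

which is the Chapman-Kolmogorov equation (16.17) for this situation, representing the
transfer of probability to the final position $x$ by all possible paths, so that we can write

$$\langle x^2\rangle = \int\!\!\int p_{\rm OU}(x - x_0,t)\,p(x_0,0)\,x^2\,dx\,dx_0 = \int p(x,t)\,x^2\,dx = \frac{kT}{\tilde\kappa(t)}, \tag{17.17}$$

by insertion of (17.11), and similarly

$$\langle x_0^2\rangle = \int\!\!\int p_{\rm OU}(x - x_0,t)\,p(x_0,0)\,x_0^2\,dx\,dx_0 = \int p(x_0,0)\,x_0^2\,dx_0 = \frac{kT}{\kappa_0}, \tag{17.18}$$

since $\int p_{\rm OU}(x - x_0,t)\,dx = 1$ by normalisation.
Hence we can write

$$\langle\Delta s_i\rangle = \frac{k}{2}\left(\frac{\kappa}{\kappa_0} - \frac{\kappa}{\tilde\kappa(t)} + \ln\left(\frac{\kappa_0}{\tilde\kappa(t)}\right)\right), \tag{17.19}$$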


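which is sketched in Figure 17.3. From this result we can show that

$$\frac{d\langle\Delta s_i\rangle}{dt} = \frac{k}{2}\left(\frac{\kappa}{\tilde\kappa^2(t)} - \frac{1}{\tilde\kappa(t)}\right)\frac{d\tilde\kappa}{dt}
= \frac{k}{2\tilde\kappa^2}(\kappa - \tilde\kappa)\frac{d\tilde\kappa}{dt} = \frac{k}{\alpha}\frac{(\kappa - \tilde\kappa)^2}{\tilde\kappa} \ge 0, \tag{17.20}$$

using (17.12), and it is clear that the rate of change of $\langle\Delta s_i\rangle$ can never be negative.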
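The monotonic growth of the mean can be examined numerically. The sketch below assumes the closed forms that follow from the Gaussian pdfs of this section, namely a mean of $(k/2)[\kappa/\kappa_0 - \kappa/\tilde\kappa + \ln(\kappa_0/\tilde\kappa)]$ and a rate of $(k/\alpha)(\kappa - \tilde\kappa)^2/\tilde\kappa$; these expressions and the function names are stated here as assumptions of the example.

```python
import math


def kappa_tilde(t, kappa0, kappa, alpha):
    """Time-dependent stiffness parameter of the relaxing Gaussian pdf."""
    return kappa0 * kappa / (kappa0 + (kappa - kappa0) * math.exp(-2.0 * kappa * t / alpha))


def mean_entropy_production(t, kappa0, kappa, alpha, k=1.0):
    """Assumed closed form for the path-averaged entropy production up to time t."""
    kt = kappa_tilde(t, kappa0, kappa, alpha)
    return 0.5 * k * (kappa / kappa0 - kappa / kt + math.log(kappa0 / kt))


def entropy_production_rate(t, kappa0, kappa, alpha, k=1.0):
    """Assumed closed form for the rate of change of the mean; never negative."""
    kt = kappa_tilde(t, kappa0, kappa, alpha)
    return k * (kappa - kt) ** 2 / (alpha * kt)


# Tabulate the mean and its rate for kappa0 = 1, kappa = 2, alpha = 1 (k = 1).
for t in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(t, mean_entropy_production(t, 1.0, 2.0, 1.0), entropy_production_rate(t, 1.0, 2.0, 1.0))
```

The mean starts at zero, rises steadily and saturates once $\tilde\kappa$ has reached $\kappa$, at which point the rate has fallen to zero.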

So does $\langle\Delta s_i\rangle$ bear any resemblance to the overall change in the entropy we would
expect for this process according to classical or statistical thermodynamics? We can check
this by calculating the changes in nonequilibrium entropy of the system and environment
resulting from the process. We calculate the change in system entropy defined according
to (16.38), namely

$$\Delta S_{\rm sys} = -k\int dx\, p(x,t)\ln p(x,t) + k\int dx_0\, p(x_0,0)\ln p(x_0,0) = \frac{k}{2}\ln\left(\frac{\kappa_0}{\tilde\kappa(t)}\right), \tag{17.21}$$

and as the work done during the process is zero, the change in reservoir entropy is
related to the energy transfer as follows:

$$\Delta S_r = -\frac{1}{T}\left(\frac{1}{2}\kappa\langle x^2\rangle - \frac{1}{2}\kappa\langle x_0^2\rangle\right) = \frac{k}{2}\left(\frac{\kappa}{\kappa_0} - \frac{\kappa}{\tilde\kappa(t)}\right), \tag{17.22}$$

using $\langle x^2\rangle = kT/\tilde\kappa(t)$ and $\langle x_0^2\rangle = kT/\kappa_0$ from (17.17) and (17.18), such that

$$\Delta S_{\rm tot} = \Delta S_r + \Delta S_{\rm sys} = \frac{k}{2}\left(\frac{\kappa}{\kappa_0} - \frac{\kappa}{\tilde\kappa(t)} + \ln\left(\frac{\kappa_0}{\tilde\kappa(t)}\right)\right), \tag{17.23}$$

and just as desired, this matches $\langle\Delta s_i\rangle$ in (17.19).

We now draw an analogy with the example of entropy production due to the transfer
of heat between a reservoir and an ideal gas considered in Section 2.7. In Figure 17.3 we
illustrate the case $\kappa > \kappa_0$, and it should be recognised that this is equivalent to a process
of cooling the Brownian particle from an initial state at a temperature $T_0 = T\kappa/\kappa_0 > T$
towards the reservoir temperature $T$. This is made apparent by writing the initial pdf
as $p(x_0,0) \propto \exp(-\kappa_0 x_0^2/2kT) = \exp(-\kappa x_0^2/2kT_0)$, and noting that as $t \to \infty$ the pdf
evolves towards the form $p(x,\infty) \propto \exp(-\kappa x^2/2kT)$. The limit of $\langle\Delta s_i\rangle$ for $t \to \infty$
when equilibrium is restored is then

$$\langle\Delta s_i\rangle = \frac{k}{2}\left(\frac{\kappa}{\kappa_0} - 1 + \ln\frac{\kappa_0}{\kappa}\right) = \frac{k}{2}\left(\frac{T_0}{T} - 1 + \ln\frac{T}{T_0}\right), \tag{17.24}$$

since $\tilde\kappa(t \to \infty) = \kappa$, and this depends on the initial and final temperatures in exactly the
same way as we found for the cooling of the ideal gas in (2.30). The cycle is complete.


17.4 Entropy Production Arising from a Single Random Walk 

There is one major conceptual leap that remains to be made. We have introduced a
quantity $\Delta s_i$ that is associated with a specific path taken by the oscillator. When averaged
over all possible paths generated by the dynamics, its value appears to be equal to the
thermodynamic entropy produced in the nonquasistatic relaxation process initiated by
the release of a constraint. So $\Delta s_i$ is sampled from a distribution whose mean is $\Delta S_i$.

The conceptual leap is to consider that $\Delta s_i$ is also an entropy production, but one that
is associated with a particular outcome of the dynamics, unlike $\Delta S_i$, which is associated
with the whole set of possibilities. Except this is an entropy production that does not
satisfy the second law. We know this because its distribution is not restricted to
non-negative values. If it were, then $\exp(-\Delta s_i/k)$ would always be less than one, and yet
the integral fluctuation relation (17.7) requires the average of this quantity to be unity.
The entropy production of individual paths fluctuates, and can occasionally be negative.
This is illustrated in Figure 17.4.

Figure 17.4 A single trajectory $x(t)$ taken by a Brownian particle has an associated entropy
production $\Delta s_i(t)$ that can rise and fall as time progresses. When averaged over all trajectories it
increases monotonically in correspondence with the second law.
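The fluctuating character of the path entropy production can be explored by sampling. The following is a minimal Monte Carlo sketch, assuming a Gaussian initial pdf of stiffness $\kappa_0$, the Gaussian Ornstein-Uhlenbeck transition pdf of Section 16.5, and the definition of $\Delta s_i$ as $k$ times the logarithm of the ratio of forward to backward path probabilities over an interval $t$. Units with $k = 1$ are used, and the function names, sample size and seed are illustrative choices.

```python
import math
import random


def log_gaussian(x, mean, var):
    """Logarithm of a normalised Gaussian pdf."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def path_entropy_productions(n, t, kappa0=1.0, kappa=2.0, alpha=1.0, kT=1.0, seed=1):
    """Sample n paths (x0 -> x over time t) and return their entropy productions in units of k."""
    rng = random.Random(seed)
    decay = math.exp(-kappa * t / alpha)
    var_ou = kT * (1.0 - decay * decay) / kappa       # variance of the transition pdf
    var_0 = kT / kappa0                                # variance of the initial pdf
    kt = kappa0 * kappa / (kappa0 + (kappa - kappa0) * decay * decay)
    var_t = kT / kt                                    # variance of the pdf at time t
    values = []
    for _ in range(n):
        x0 = rng.gauss(0.0, math.sqrt(var_0))
        x = rng.gauss(x0 * decay, math.sqrt(var_ou))
        log_forward = log_gaussian(x, x0 * decay, var_ou) + log_gaussian(x0, 0.0, var_0)
        log_backward = log_gaussian(x0, x * decay, var_ou) + log_gaussian(x, 0.0, var_t)
        values.append(log_forward - log_backward)
    return values


samples = path_entropy_productions(50000, 0.5)
mean_ds = sum(samples) / len(samples)
mean_exp = sum(math.exp(-v) for v in samples) / len(samples)
negative_fraction = sum(1 for v in samples if v < 0.0) / len(samples)
print(mean_ds, mean_exp, negative_fraction)
```

With these settings the sample mean of $\Delta s_i$ should be positive, the sample mean of $\exp(-\Delta s_i/k)$ should be close to unity within statistical error, and a sizeable fraction of the individual paths should have negative entropy production.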

There are advantages in taking this view because it removes some of the conceptual
difficulties that surround entropy production. With this interpretation, the entropy
production is just a quantity associated with a particular path along which a system might
develop in time. It is not necessary to imagine repeating an observation and taking an
average in order to determine the entropy production. A single random walk of a particle
would do, along with a description of the dynamics of probability that accompany the
process. The latter is not necessarily a representation of the behaviour of many repeated
paths, but is a judgement about the probability of a single realisation of the process. As
the walk proceeds, it clocks up a tally of $\Delta s_i$ entropy production, and on the whole it
goes up, but sometimes it goes down.

Clearly $\Delta s_i$ is defined in terms of probabilities of various events, and therefore relies
on employing a stochastic rather than a deterministic model for the dynamics of a system.
Entropy production is related to the increase in uncertainty, and arguably it will occur
only if there is randomness in the dynamics. It is natural to take this view because it
ties in with the original statistical interpretation of entropy proposed by Gibbs.

An entropy production that can be associated with the path taken by a system through
its phase space implies that there is a system entropy that depends on the microstate.
From (16.38), we have

$$S_{\rm sys}(t) = \int dx\, p(x,t)\left[-k\ln\left(\frac{p(x,t)}{p_{\rm ref}}\right)\right], \tag{17.25}$$

where $p_{\rm ref}$ is a constant, suggesting that the system entropy of a microstate is $s_{\rm sys} =
-k\ln(p(x,t)/p_{\rm ref})$ such that $S_{\rm sys} = \langle s_{\rm sys}\rangle$. We shall not pursue this any further, but again
it is conceptually useful to recognise that we can conceive of an entropy of a system
that does not require averaging over different arrangements; that is a property even of a
microstate.

The entropy production $\Delta s_i$ is related to other path-dependent quantities. We have
already seen this explicitly in (17.14) in terms of particle positions, but there is an
identity, which we shall not prove, such that

$$T\Delta s_i^{e} = \Delta w - \Delta F, \tag{17.26}$$

where $\Delta s_i^{e}$ is the entropy production for a path undertaken between microstates when
the system evolves from initial to final equilibrium states, which naturally requires that
the duration of the process is long enough for relaxation to be completed. $\Delta w$ is the
work done on the system over the course of such a path, and $\Delta F$ is the associated
difference in system free energy. This is a counterpart of the result $T\Delta S_i = \Delta W - \Delta F$
for an isothermal nonquasistatic work process in classical thermodynamics derived in
(3.22). The average of $\Delta w$ over all possible behaviours of the system, each of which
draws different amounts of work from the environment, is equal to $\Delta W$. This connection
will be explored further in the next section.

17.5 Further Fluctuation Relations 

From the integral fluctuation relation (17.7) and expression (17.26), we can deduce
the result

$$\left\langle\exp\left(-\frac{\Delta w - \Delta F}{kT}\right)\right\rangle = 1, \tag{17.27}$$

for a process where the system starts and finishes in equilibrium. Let us focus on a
process specified by a time-dependent change in the confining volume while the system
remains in contact with a reservoir at constant temperature. The change in free energy
$\Delta F$ associated with the process will depend on the initial and final volumes. The work
done will depend on the history of compressions and expansions, as well as the detailed
path followed by the system over the course of the process.

We now imagine starting with the system in equilibrium, imposing such a sequence
of volume changes, and then continuing for an extended period of relaxation at constant
volume until equilibrium is restored. Equation (17.27) holds for such a process, but
we recognise that $\Delta w$ and $\Delta F$ no longer evolve once the volume stops changing.
This has the implication that the period of relaxation does not change the value of
$\langle\exp(-(\Delta w - \Delta F)/kT)\rangle$ and hence (17.27) holds at an arbitrary point in the process,
not just at a final equilibrium.

The expression (17.27) is called the Jarzynski equality. It has more practical implications
than the integral fluctuation relation (17.7) because it involves readily measured
quantities, namely work and free energy, instead of the more ethereal entropy production.

We can see the Jarzynski equality operating in practice for the harmonic oscillator
example. We take the initial spring constant of the oscillator to be $\kappa_0$, such that the
Gaussian $p(x_0,0) \propto \exp(-\kappa_0 x_0^2/2kT)$ specified in (17.10) is the initial canonical
equilibrium pdf. Then we consider a process consisting of a step change in spring constant
from $\kappa_0$ to $\kappa$ at $t = 0$, as illustrated in Figure 17.5.

Figure 17.5 The pdf $p(\Delta w)$ of work done on a system arising from a step change in spring
constant from $\kappa_0 = 1$ to $\kappa = 2$ at $t = 0$, starting in equilibrium. The average of $\Delta w$ is greater than
the change in free energy, as implied by the Jarzynski equality (17.27).

The work done on the system is the input of potential energy over the course of the
process, and this is entirely performed at the shift in spring constant at $t = 0$. We have

$$\Delta w = \frac{1}{2}(\kappa - \kappa_0)x_0^2, \tag{17.28}$$

and using (17.18), the mean work is

$$\Delta W = \langle\Delta w\rangle = \frac{1}{2}(\kappa - \kappa_0)\langle x_0^2\rangle = \frac{1}{2}\frac{(\kappa - \kappa_0)kT}{\kappa_0}. \tag{17.29}$$

The difference in the free energy corresponding to the change in spring constant from
$\kappa_0$ to $\kappa$ is $\Delta F = -kT\ln(Z(\kappa)/Z(\kappa_0)) = (1/2)kT\ln(\kappa/\kappa_0)$, where $Z(\kappa) = kT/\hbar\omega$ is the
canonical partition function of the 1-d classical harmonic oscillator given in (14.3), using
$\omega = (\kappa/m)^{1/2}$. We can therefore establish that the quantity that appears in the Jarzynski
equality takes the form

$$\Delta w - \Delta F = \frac{1}{2}kT\left((\kappa - \kappa_0)\frac{x_0^2}{kT} - \ln\frac{\kappa}{\kappa_0}\right), \tag{17.30}$$

and this is equal to $T\Delta s_i(x,\infty;x_0,0)$ from (17.14), bearing in mind that $\tilde\kappa \to \kappa$ as
$t \to \infty$. It is then clear that the averaged result $\Delta W - \Delta F = T\langle\Delta s_i(x,\infty;x_0,0)\rangle$ also
holds. Furthermore,

$$\Delta W - \Delta F = \langle\Delta w\rangle - \Delta F = \frac{1}{2}kT\left(\frac{\kappa - \kappa_0}{\kappa_0} - \ln\frac{\kappa}{\kappa_0}\right) \ge 0, \tag{17.31}$$

in explicit agreement with the classical second law (3.22).
We can now verify the Jarzynski equality:

$$\left\langle\exp\left(-\frac{\Delta w}{kT}\right)\right\rangle = \int dx_0\, p(x_0,0)\exp\left(-\frac{(\kappa - \kappa_0)x_0^2}{2kT}\right)
= \left(\frac{\kappa_0}{2\pi kT}\right)^{1/2}\int dx_0 \exp\left(-\frac{\kappa_0 x_0^2}{2kT}\right)\exp\left(-\frac{(\kappa - \kappa_0)x_0^2}{2kT}\right)
= \left(\frac{\kappa_0}{\kappa}\right)^{1/2} = \exp\left(-\frac{\Delta F}{kT}\right), \tag{17.32}$$

as required. The averaging is performed over the initial position $x_0$ alone since the work
performed in this example takes place at the instant when the spring constant changes.
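
The same average can be estimated by sampling. The sketch below draws initial positions from the canonical Gaussian with variance kT/κ_0, evaluates the work from (17.28), and compares the sample mean of exp(−Δw/kT) with (κ_0/κ)^{1/2}. The function name, sample size and seed are arbitrary choices for illustration.

```python
import math
import random

def jarzynski_average(kappa0, kappa, kT=1.0, samples=200_000, seed=0):
    """Monte Carlo estimate of <exp(-work/kT)> for a step change kappa0 -> kappa."""
    rng = random.Random(seed)
    sigma = math.sqrt(kT / kappa0)  # standard deviation of the initial equilibrium pdf
    total = 0.0
    for _ in range(samples):
        x0 = rng.gauss(0.0, sigma)
        work = 0.5 * (kappa - kappa0) * x0 * x0  # work done at the step, (17.28)
        total += math.exp(-work / kT)
    return total / samples

kappa0, kappa = 1.0, 2.0
estimate = jarzynski_average(kappa0, kappa)
exact = math.sqrt(kappa0 / kappa)  # equals exp(-Delta F / kT)
print(estimate, exact)
```

For a step down in spring constant the exponential weights rare large values of x_0 heavily, so the sample mean converges more slowly and a larger sample would be needed.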

Since we know how $x_0$ is distributed, we can go further and derive the pdf of $\Delta w$
for this case. First, we identify the initial position $x_0$ that gives rise to work $\Delta w$ using
(17.28):

$$x_0 = \pm\left(\frac{2\Delta w}{\kappa - \kappa_0}\right)^{1/2}, \tag{17.33}$$

and it should be noticed that if $\kappa > \kappa_0$ we expect only positive values of $\Delta w$, and only
negative values if $\kappa < \kappa_0$. In the example illustrated in Figure 17.3, therefore, we shall
find only positive values of work. Next we obtain the pdf of $x_0^2$ from the pdf of $x_0$
using $p(x_0^2)\,dx_0^2 = p(x_0)\,dx_0$ so $p(x_0^2) = (1/2)p(x_0)/|x_0|$ for each of the two roots in (17.33),
taking care to ensure that $p(x_0^2)$ is positive, and hence, summing over both roots, we write
for $\kappa > \kappa_0$ and $\Delta w > 0$

$$p(\Delta w) = \left(\frac{\kappa_0}{\pi kT(\kappa - \kappa_0)\,\Delta w}\right)^{1/2} \exp\left(-\frac{\kappa_0\,\Delta w}{(\kappa - \kappa_0)\,kT}\right), \tag{17.34}$$

which we plot in Figure 17.5. The implication of (17.31) is that $\langle \Delta w \rangle \ge \Delta F$ and this is
reflected in the distribution, but just as clearly, it is easily feasible that the particle can
follow a path such that $\Delta w < \Delta F$.
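
How often a single realisation gives Δw < ΔF can be estimated with a short sketch. The closed form used here is an assumption of the sketch: it is the pdf obtained by carrying the Gaussian pdf of x_0 through the change of variables (17.33) for a step up (κ > κ_0), and its integral up to a limit is an error function. The sampled estimate uses only (17.28) and the initial canonical pdf, so it serves as an independent cross-check. All function names are illustrative.

```python
import math
import random

def work_pdf(work, kappa0, kappa, kT=1.0):
    """Assumed pdf of work for a step up kappa0 -> kappa; zero for work <= 0."""
    if work <= 0.0:
        return 0.0
    a = kappa0 / ((kappa - kappa0) * kT)
    return math.sqrt(a / (math.pi * work)) * math.exp(-a * work)

def prob_work_below(limit, kappa0, kappa, kT=1.0):
    """P(work < limit): the integral of work_pdf from 0 to limit."""
    if limit <= 0.0:
        return 0.0
    a = kappa0 / ((kappa - kappa0) * kT)
    return math.erf(math.sqrt(a * limit))

kappa0, kappa, kT = 1.0, 2.0, 1.0
delta_F = 0.5 * kT * math.log(kappa / kappa0)
analytic = prob_work_below(delta_F, kappa0, kappa, kT)

# Cross-check: sample x0 from the initial canonical pdf and count work < Delta F.
rng = random.Random(1)
sigma = math.sqrt(kT / kappa0)
n = 100_000
hits = sum(0.5 * (kappa - kappa0) * rng.gauss(0.0, sigma) ** 2 < delta_F
           for _ in range(n))
sampled = hits / n
print(analytic, sampled)
```

The two estimates agree to within sampling error, which shows that realisations with Δw < ΔF are far from rare for a single oscillator even though the mean work exceeds ΔF.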

Further results may be proved that add to our understanding of the fluctuations in 
work and entropy production in the course of various processes. However, this is not 
the place to prove them or even to illustrate them with explicit examples; see Further 
Reading. 

The Crooks relation states that the outcome of a sequence of mechanical actions on 
a system in a heat bath is related to the outcome of the reversed sequence of actions. It 
is a connection between the probability density functions of the work $\Delta w$ performed on
the system in the course of such forward and backward processes, $p_F(\Delta w)$ and $p_B(\Delta w)$,
respectively. A forward process might consist of the movement of a piston to compress
a gas in a cylinder, while the backward process would be the opposite movement to
expand the gas. The validity of the Crooks relation requires that the system should start
in canonical equilibrium at the same temperature $T$ for both processes. The relation reads

$$p_B(-\Delta w) = p_F(\Delta w) \exp\left(-\frac{\Delta w - \Delta F}{kT}\right), \tag{17.35}$$
where $\Delta F$ is the change in free energy of the system associated with the forward
process, evaluated for example on the basis of an isothermal change in volume of the
expanded gas.

According to (17.35), the probability $p_F(\Delta w)$ that the forward process should require
work $\Delta w$, and the probability $p_B(-\Delta w)$ of receiving the same amount of work from
the system during the backward process are related to each other. If the work done in
the forward process is greater than $\Delta F$, which is overwhelmingly likely for the
nonquasistatic processing of macroscopic systems, where we expect classical thermodynamics
to prevail, then the probability of getting the work back again is exponentially small,
according to (17.35). The Crooks relation is an expression of the vanishingly small
likelihood of mechanical reversibility in the nonquasistatic thermodynamic limit. On the
other hand, in a quasistatic process, we expect $\Delta w$ to equal $\Delta F$ for any path, in which
case the recovery of the work in the reversal of the process is guaranteed.
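
To get a feel for the size of the exponential factor in (17.35), the sketch below evaluates the logarithm of the ratio p_B(−Δw)/p_F(Δw) for a few values of the dissipated work Δw − ΔF. The temperature of 300 K and the chosen amounts of dissipated work are illustrative assumptions, and the base-10 logarithm is printed because the ratio itself underflows for macroscopic values.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def crooks_log_ratio(work, delta_F, kT):
    """ln[p_B(-work) / p_F(work)] = -(work - delta_F)/kT, from (17.35)."""
    return -(work - delta_F) / kT

kT = K_B * 300.0  # assumed room temperature

# Only the difference work - delta_F matters, so delta_F is set to zero here.
for label, dissipated in (("1 kT", kT), ("10 kT", 10.0 * kT), ("1 nJ", 1.0e-9)):
    log10_ratio = crooks_log_ratio(dissipated, 0.0, kT) / math.log(10.0)
    print(f"dissipated work {label}: p_B/p_F = 10^({log10_ratio:.3g})")
```

Dissipation of a few kT leaves the backward outcome merely improbable, whereas even a nanojoule of dissipated work makes the exponent astronomically large and negative.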

A more general result in a similar vein is the detailed fluctuation relation, or the
equivalent Evans-Searles fluctuation theorem. In the context developed here, this
concerns the entropy production brought about by forward and reverse processes, and reads

$$p_B(-\Delta s_{\rm tot}) = p_F(\Delta s_{\rm tot}) \exp\left(-\frac{\Delta s_{\rm tot}}{k}\right), \tag{17.36}$$


with the stipulation that the forward process followed by the backward process should 
return the pdf of the system to its initial form. While the Crooks relation determines the 
extent to which work put in can be taken out again, (17.36) quantifies the likelihood that 
a reverse operation on a system might negate the entropy generated in a forward process. 
Sketches of entropy production for a forward process and its backward version are shown 
in Figure 17.6. Clearly, the likelihood of observing events that leave the overall entropy
unchanged is exponentially suppressed, unless the changes in $\Delta s_{\rm tot}$ for each operation are
small compared with $k$, and this is itself very unlikely except for microscopic systems,
or processes that are exceedingly slow.


Figure 17.6 Entropy production has a positive mean in a forward process and its backward 
counterpart. The probability that the backward process should negate the entropy generated in the 
forward process is exponentially small, according to the detailed fluctuation relation (17.36). 
17.6 The Fundamental Basis of the Second Law

The fluctuation relations emerged from studies of the thermodynamic behaviour of small 
systems. Their principal value is that they tell us something about the statistics of work 
done or heat transferred in circumstances when there are significant fluctuations in
system behaviour. But equally importantly they can provide us with an interpretation of
entropy production in terms of the statistical reversibility of sequences of events. We 
have explored how this definition can be linked to established thermodynamic concepts 
using a simple example of an oscillator, but the concept can be applied more generally. 

The probability of a particular sequence of events in systems modelled by random 
dynamics can be compared with the probability of the reverse sequence, under the 
reversal of external driving forces, if there are any. We define a quantity $\Delta s_{\rm tot}$ that is
positive for the more likely of the two, and negative but of equal magnitude for the other. 
We average the quantity over the two sequences according to their relative likelihoods, 
and naturally the result is positive. We consider all possible sequences and their reversals, 
and the average remains positive. This is thermodynamic entropy production, at least 
within the framework of stochastic or random dynamics. It is an indication that the system 
has a preference to evolve in a certain manner rather than in the opposite direction. 
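
The averaging argument above can be made concrete with a small sketch. It assumes, for illustration, that the quantity is k times the logarithm of the ratio of the probabilities of a sequence and its reverse, and that there is no external driving, so that a sequence and its reversal belong to the same ensemble. The probabilities in the example are made up.

```python
import math

def entropy_production(p_sequence, p_reverse, k=1.0):
    """Assumed form: k ln(P[sequence]/P[reverse]); positive for the likelier of the two."""
    return k * math.log(p_sequence / p_reverse)

def mean_entropy_production(pairs, k=1.0):
    """Average over sequences and their reversals, weighted by their probabilities.

    pairs: list of (P[sequence], P[reversed sequence]) tuples.
    """
    total_prob = sum(pf + pr for pf, pr in pairs)
    mean = 0.0
    for pf, pr in pairs:
        mean += pf * entropy_production(pf, pr, k)
        mean += pr * entropy_production(pr, pf, k)
    return mean / total_prob

# Hypothetical probabilities for two sequences and their reversals (summing to 1).
pairs = [(0.6, 0.1), (0.2, 0.1)]
print(mean_entropy_production(pairs))
```

Each pair contributes (P_f − P_r) ln(P_f/P_r), which cannot be negative, so the average is positive unless every sequence is exactly as likely as its reverse.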

But hold on. It is believed that real dynamics are fundamentally deterministic and 
reversible in the following sense. If all the atoms in the world had their velocities 
reversed, then the equations of motion would dictate that they retraced their steps: this 
would then guarantee the reversal of all the ‘irreversible’ phenomena observed up to 
that point. This observation was made by Josef Loschmidt (1821-1895) in response 
to Boltzmann’s attempts to relate the second law to Newtonian mechanics. Matters 
are different in a model with stochastic dynamics: we typically find that forward and 
backward sequences of events have different likelihoods, and reversing the velocities does 
not imply that previous behaviour is retraced. An interpretation of entropy production 
based on ratios of path probabilities within a stochastic framework has given us some 
understanding of the second law. But we ought to accept that probabilistic dynamics is a
model, where our lack of knowledge about, or lack of interest in, the complete dynamics
affecting a system is represented through the coarse graining and randomisation of
some aspect of the interactions. In a sense, therefore, entropy and its production seem
not to be as objectively real as some other system variables, but will depend on the way 
we choose to represent the behaviour. With this in mind, in Chapter 18 we once again 
address the question ‘what is entropy?’. 


Exercises 

17.1 Show that (17.11) is a solution of (17.9) and that the solution of (17.12) is (17.13). 

17.2 Using the distribution of work $p(\Delta w)$, starting in equilibrium, for a step up in
spring constant from $\kappa_0$ to $\kappa$ given in (17.34), determine the distribution for a step
down from $\kappa$ to $\kappa_0$ and hence verify the Crooks relation (17.35).


18 

Final Remarks 


The main purpose of this book was to describe statistical models of the behaviour of 
physical systems that can account for the laws of classical thermodynamics. In particular,
we needed to develop an interpretation of entropy that leaves as few questions
unanswered as possible. The tools that emerged from this allow us to enquire into the 
likely microscopic or macroscopic behaviour of a system, principally when it is coupled 
to environments of various kinds. We have considered gases of particles at high and 
low temperatures and densities, with or without pairwise interactions; magnets; vacancies
in crystals; harmonic oscillators; solids and electromagnetic radiation. Much more
can be done with these tools, and there are many studies in the literature describing 
further applications, including the vitally important confrontation between models and 
experimental data. 

The basis of statistical physics is the principle of equal a priori probabilities, which 
states that all microscopic configurations of an isolated system are equally likely to be 
realised when the system is in equilibrium. The system might be likened to a multiple-faced
die that is thrown repeatedly: each face, or each microstate, is imagined to be
equally likely to come up. The time-averaged properties of an isolated system at
equilibrium are then equal to ensemble averages over all possible microstates.

The principle is simple and appealing, but it is hard to find an absolutely satisfactory 
justification for it. Perhaps the best argument is that it is sometimes appropriate to model 
parts of the world using stochastic dynamical rules, and that these must convey a system 
into its microstates with equal likelihood if there is no discernible reason why any should 
be favoured over the rest. While this view might be questioned, the conclusions that 
follow seem to be consistent with the equilibrium behaviour of complex physical systems. 

The centrepiece of statistical and classical thermodynamics is the concept of entropy, 
a quantity that has intrigued generations of scientists, and caused no small amount of 
confusion. When equilibrium is disturbed by the lifting of a constraint, such as when 
a partition between boxes containing different gases is removed, an isolated system 
will undergo an increase in entropy: the famous second law of thermodynamics. In all 
processes, except for those that proceed extremely slowly, entropy is generated. The 
entropy of the universe tends towards a maximum. But what is entropy? 
We have emphasised repeatedly that entropy need not be enveloped in mystery. It is
a thermodynamic property of a physical system, obtainable from measurements of the 
heat capacity or from the equation of state. We have plotted its value against parameters 
such as temperature for a variety of cases. It would be quite ordinary were it not for 
the second law and the insistence that for an isolated system it can go up but never go 
down. This suggests that entropy is not a quantity that can be described in the same 
manner as particle positions and momenta, which are allowed to reverse their evolution. 
Determining an appropriate interpretation of entropy has been the most puzzling issue 
in the development of statistical physics. 

The meaning of entropy that makes the most sense to me personally is that it expresses 
the uncertainty in the microscopic state of a system. Such uncertainty is an inevitable 
consequence of a failure to include all the details of the dynamics that describe the
evolution of the system. This neglect might arise because the universe is very complicated,
and we only wish to focus on certain features. So when studying the motion of a
pendulum, we happily apply Newton’s laws of motion for the bob, but are not inclined to
consider the motion of the gas molecules that collide with the bob as it swings. Ignoring 
the molecules will neglect the fact that energy will leak away from the pendulum; so we 
have to do something or we cannot account for the fact that it will eventually appear to 
stop moving. One way is to represent the dissipation of energy in a random, uncertain 
way. It is at this level of description that a quantity can arise that does not retrace its 
steps. It is microscopic uncertainty itself, and entropy is its measure. 

In short, if we know the present microscopic state of only parts of the universe, or 
equivalently know only some of the dynamical rules that control its evolution, then our 
certainty about the state of the universe will naturally decline in the future, and this is 
the second law. 

Is entropy fundamental? Not in the same way as energy or particle number. It is an 
emergent property of a system, meaning that it shows up when we have to deal with 
a complex system in a coarse-grained way. A single particle or set of particles, with 
known coordinates, does not possess entropy until it is coupled to a coarsely specified 
environment for a period of time, whereby the uncertainty in the initial state of the 
environment leads to uncertainty in the state of the system. A system does not possess 
entropy if we know precisely where all the particles are, and how fast they are going. 
But typically we do not possess all the microscopic information about a complex system, 
and the missing information is essentially its entropy. It exists because we neglect details 
in our models: it is emergent. The most remarkable thing is that microscopic uncertainty 
can be measured with thermometers and pressure gauges. 

So entropy production is a reflection of the loss of information (an increase in missing 
information) about the microscopic state of a system as time progresses, when we are 
able to follow the evolution only on a macroscopic, coarse-grained scale. The definition 
of entropy production given in Chapter 17 in terms of the probabilities of observing a 
certain sequence of events and its reverse has a particular resonance, since it connects 
directly with reversibility. The law of increasing entropy may perhaps be regarded as 
simply a shorthand, or a slogan, to describe the increase in uncertainty arising from the 
roughly modelled dynamics of complex systems. Because it is expressed in the form of 
a law, it can offer an intuitive understanding of many kinds of behaviour. 


A line of enquiry that reveals the connection between entropy and information involves 
a character known as Maxwell’s Demon and it would be remiss to leave him out of the 
discussion. In trying to understand the meaning of the second law, Maxwell imagined 
that a tiny creature could observe the motion of particles within a container and, by 
judicious use of a trapdoor in a partition placed across the middle, use this information 
to organise the particles between the two subvolumes just as he desired. For example, 
faster particles could be allowed to propagate into the left hand side and slower ones 
directed towards the right, producing a separation of a gas into hot and cold parts without 
doing work (assuming the trapdoor and the observation require none) which would be 
in violation of the second law! 

The implications of these actions have been discussed at length. Maxwell’s motivation 
was to show that the second law was statistical in nature: even if no intelligence was 
at work, a trapdoor flapping open and shut at random could conceivably produce such 
a separation, except it would be incredibly unlikely and extremely short-lived. But from 
the standpoint of equating entropy with uncertainty, the Demon is nothing more than an 
experimenter who makes a microscopic measurement and is thereby able to reduce his 
uncertainty about the world. He is acquiring data, or reducing the missing information, 
and is thereby able to narrow down the pdf of a few microscopic variables of the system. 
Even if he took no action, simply making the observation reduces the uncertainty in the 
system microstate, albeit temporarily, since the dynamics would presumably then proceed 
unobserved in a manner that is difficult to predict. I do a similar thing when I open a 
door briefly to find out what is happening on the other side. Opening the trapdoor to 
allow a fast particle to pass through preserves the reduction in missing information, and 
the result is a fall in the thermodynamic entropy of the gas. 
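
The reduction in missing information brought about by an observation can be illustrated with the Gibbs form of the entropy, −k Σ p ln p. The sketch below is a toy example and not a model of the Demon: it assumes a single particle that is initially equally likely to occupy any of eight cells, four on each side of the partition, and an observation that reveals only which side it is on.

```python
import math

def gibbs_entropy(probs, k=1.0):
    """Gibbs entropy -k sum(p ln p), skipping states of zero probability."""
    return -k * sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical coarse description: 8 equally likely cells, 4 on each side.
prior = [1.0 / 8.0] * 8

# After the particle is observed on the left, the right-hand cells are ruled out.
posterior = [1.0 / 4.0] * 4 + [0.0] * 4

reduction = gibbs_entropy(prior) - gibbs_entropy(posterior)
print(reduction, math.log(2.0))
```

The observation removes k ln 2 of uncertainty, the amount associated with a single binary piece of information about the particle.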

The debate about the Demon revolves around how such events are consistent with 
the second law. One resolution suggests that the microscopic state of the Demon before 
the process is known, but after he has completed his task his state is less certain. He 
has sorted the gas into a state of reduced microscopic uncertainty, but from the point of 
view of an observer who does not have access to the microscopic state of the Demon, he 
has acquired some uncertainty himself. He is the repository of an unspecified stream of 
data about the sorting process; in short he has a memory. An analysis of how he might 
be returned to his initial state reveals that external work has to be dissipated as heat 
to the environment. An alternative resolution is that obtaining the initial microscopic 
information fundamentally requires the Demon to perform work, which is also to be 
dissipated as heat, with consequent entropy production. The debate is still ongoing. 

Entropy has been related to uncertainty since the time of Gibbs, who represented it in 
terms of a probability distribution, which is a specification of uncertainty. The natural 
question is ‘whose uncertainty?’ My uncertainty might be different from yours. I might 
have made more measurements of system parameters. It is natural to be suspicious of 
a quantity when its value doesn’t seem to be objective, but rather depends on how 
closely an individual, in the same manner as the Demon, decides to investigate and to 
measure. After all, entropy appears in thermodynamics alongside quantities that do have 
an objective reality, such as energy. No wonder entropy causes so much confusion! 

But the suspicion is unfounded. We develop thermodynamics within a framework 
of a chosen set of macroscopic system variables, and the entropy that comes into the 
discussion expresses microscopic uncertainty within that framework. If we decide to 
make further measurements and identify, for example, a macroscopic parameter such as
magnetisation, then we are required to define entropy within such a broadened framework,
making it a function of the new state variable as well as the old ones. As long
as everyone is agreed on the level or the coarseness of the description, entropy is a 
well-defined measure of the remaining uncertainty, since it is to be calculated on the 
basis of measurements, such as heat capacities, made within that framework. 

Of course, the framework might be insufficient; for example, if we neglect 
magnetisation, there will be physical effects that we cannot explain, and indeed 
phenomena that might even suggest that the entropy of an isolated system goes down. 
Such is our confidence in the second law that this usually means that the neglected 
macroscopic parameter needs to be incorporated into the thermodynamic framework to 
make everything work out satisfactorily. 

The novelist and scientist C.P. Snow considered that entropy and the second law of 
thermodynamics deserve to be more widely appreciated by society. In his book The Two 
Cultures and the Scientific Revolution , he famously wrote 

‘A good many times I have been present at gatherings of people who, by the standards of 
the traditional culture, are thought highly educated and who have with considerable gusto 
been expressing their incredulity at the illiteracy of scientists. Once or twice I have been 
provoked and have asked the company how many of them could describe the second law of 
Thermodynamics. The response was cold: it was also negative. Yet I was asking something 
that is the scientific equivalent of: Have you read a work of Shakespeare’s?’ 

Entropy and its increase are indeed culturally important because they present us with 
a deep perspective on events in the world. Every macroscopic thermodynamic process,
indeed any macroscopic event we can conceive of, generates entropy as a result of the
natural tendency for energy to be shared equitably between interacting particles. This 
is equivalent to saying that every process is macroscopically irreversible: the sharing is 
unlikely to be undone. Such a view leads us to conclude that the universe is heading 
towards a state of complete entropy maximisation, known as the heat death, where matter 
and energy are uniformly spread out, where no change is perceptible on the macroscopic 
scale, and where ‘time’, or at least evolution, appears to have ended. 

The future might be bleak, but at least it is straightforward to comprehend. It is rather 
more puzzling to understand why we presently exist in a relatively low entropy universe, 
such that the driving forces for changes on the macroscopic scale are not yet exhausted. 
The most compelling interpretation is that the low entropy is a remnant feature of the Big 
Bang. Since that event, the universe has undergone an expansion and cooling analogous 
to the free expansion of a gas, but luckily for us there is still more energy conversion 
and sharing to be done. 

It will be clear by now that I prefer to view entropy as microscopic uncertainty, and 
not to rely too heavily on the traditional interpretation in terms of disorder. I find it 
hard to consider the universe to be disordered in a way that can easily be defined, but it 
certainly is disorderly, and evolving in a tremendously complex way on various spatial 
and temporal scales¹. The proper understanding of such behaviour is to be found in the
analysis of the effective equations of motion that apply at each scale, but common to
most of them is the property that the overall thermodynamic entropy increases with time.
The intrinsic disorderliness can mix and share the constituents of the universe, exploring
new modes of behaviour, with the consequence that solving the equations of motion is
a difficult task. We can associate entropy with the uncertainty in our current perception
of a physical system, and the second law with the inevitable increase of this uncertainty
into the future if we are obliged to employ such effective equations of motion.

¹ One of the advantages of being disorderly is that one is constantly making exciting discoveries. - A. A. Milne

And finally, we should note that the universe is not simply following a programme 
of decay and decline. Organised structures, including life itself, have arisen from the 
original impetus of the Big Bang and from the rules of dynamics that have been in play. 
These have developed or are maintained in a manner that can be rationalised in terms 
of overall entropy increase. The impetus ought to run out eventually, of course, and 
it would seem that a gloomy future associated with the heat death does lie ahead, but 
the universe has nevertheless burned very, very brightly, and the second law is, in some 
sense, a celebration of this behaviour, and not just a sign warning us of impending doom. 